AWS-specific code
Installing libraries
!pip install tensorflow opencv-python pillow scikit-learn
Requirement already satisfied: tensorflow in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (2.16.2)
Requirement already satisfied: opencv-python in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (4.11.0.86)
Requirement already satisfied: pillow in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (11.1.0)
Requirement already satisfied: scikit-learn in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (1.6.1)
... (transitive dependency "Requirement already satisfied" lines omitted)
import os
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
!nvidia-smi
Sat Mar 29 04:29:17 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.03 Driver Version: 550.144.03 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 |
| 0% 30C P8 16W / 300W | 1MiB / 23028MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
import boto3

def download_files_from_bucket(file, bucket):
    '''
    Download a file from the given S3 bucket to the local instance,
    keeping the same file name.
    '''
    s3 = boto3.client('s3')
    s3.download_file(bucket, file, file)
    print(f"File downloaded to {file}")
download_files_from_bucket('stanford-car-dataset-by-classes-folder.zip','pgp-capstone-project')
File downloaded to stanford-car-dataset-by-classes-folder.zip
zip_file_path = 'stanford-car-dataset-by-classes-folder.zip'
!unzip -oq {zip_file_path}
Computer vision can be used to automate supervision and generate an appropriate action trigger when an event of interest is detected in an image. For example, a camera can identify a car moving on the road by its make, type, colour, number plate, etc.
Design a DL-based car identification model.
The Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, where each class has been split roughly in a 50-50 split. Classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe.
Data description:
‣ Train Images: Consists of real images of cars as per the make and year of the car.
‣ Test Images: Consists of real images of cars as per the make and year of the car.
‣ Train Annotation: Consists of bounding box region for training images.
‣ Test Annotation: Consists of bounding box region for testing images.
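Each annotation row therefore pairs one image with a single bounding box in (xmin, ymin, xmax, ymax) pixel coordinates plus an integer class id. Box width and height follow directly from the corner coordinates; a minimal sketch (the hard-coded values are taken from the first training annotation row):

```python
# One annotation record in the (image_name, xmin, ymin, xmax, ymax, class) layout
# used by anno_train.csv; values copied from the first training row.
record = {"image_name": "00001.jpg", "xmin": 39, "ymin": 116,
          "xmax": 569, "ymax": 375, "image_class": 14}

def bbox_size(r):
    """Return (width, height) of the box in pixels."""
    return r["xmax"] - r["xmin"], r["ymax"] - r["ymin"]

w, h = bbox_size(record)
print(w, h)  # 530 259
```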
import os
import zipfile
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt #for visualization
import matplotlib.patches as patches
import seaborn as sns
from PIL import Image # For image loading and manipulation
from pathlib import Path
import cv2
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
from sklearn.utils.class_weight import compute_class_weight
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, GlobalAveragePooling2D, BatchNormalization
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.utils import class_weight
from tensorflow.keras.applications.resnet50 import preprocess_input as resnet_preprocess
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mobilenet_preprocess
from tensorflow.keras.applications.inception_v3 import preprocess_input as googlenet_preprocess
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications import ResNet50
Matplotlib is building the font cache; this may take a moment.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)  # Allow dynamic GPU memory allocation
    except RuntimeError as e:
        print(e)
4A. Data Handling - Import Data
train_annotations_df = pd.read_csv( "anno_train.csv",header=None)
test_annotations_df = pd.read_csv( "anno_test.csv",header=None)
image_class_df = pd.read_csv( "names.csv",header=None)
train_annotations_df.rename(columns={0:"image_name",1:"xmin",2:"ymin",3:'xmax',4:'ymax',5:'image_class'},inplace=True)
test_annotations_df.rename(columns={0:"image_name",1:"xmin",2:"ymin",3:'xmax',4:'ymax',5:'image_class'},inplace=True)
image_class_df.rename(columns={0:'image_name'},inplace=True)
train_annotations_df.head(5)
|   | image_name | xmin | ymin | xmax | ymax | image_class |
|---|---|---|---|---|---|---|
| 0 | 00001.jpg | 39 | 116 | 569 | 375 | 14 |
| 1 | 00002.jpg | 36 | 116 | 868 | 587 | 3 |
| 2 | 00003.jpg | 85 | 109 | 601 | 381 | 91 |
| 3 | 00004.jpg | 621 | 393 | 1484 | 1096 | 134 |
| 4 | 00005.jpg | 14 | 36 | 133 | 99 | 106 |
test_annotations_df.head()
|   | image_name | xmin | ymin | xmax | ymax | image_class |
|---|---|---|---|---|---|---|
| 0 | 00001.jpg | 30 | 52 | 246 | 147 | 181 |
| 1 | 00002.jpg | 100 | 19 | 576 | 203 | 103 |
| 2 | 00003.jpg | 51 | 105 | 968 | 659 | 145 |
| 3 | 00004.jpg | 67 | 84 | 581 | 407 | 187 |
| 4 | 00005.jpg | 140 | 151 | 593 | 339 | 185 |
# for images
base_dir = Path(r"./car_data") #replace the directory accordingly
train_images_path = base_dir / "car_data" / "train"
test_images_path = base_dir / "car_data" / "test"
train_images_path = Path(train_images_path).resolve()
test_images_path = Path(test_images_path).resolve()
print(f"train image path is {train_images_path}")
print(f"test image path is {test_images_path}")
train image path is /home/ec2-user/SageMaker/car_data/car_data/train
test image path is /home/ec2-user/SageMaker/car_data/car_data/test
4B. Data Handling - Map Images w.r.t Classes
#Train Images class mapping
#Folder where multiple train images are stored
train_class_folders = [f.path for f in os.scandir(train_images_path) if f.is_dir()]
# Collect every training image path, then map image filename -> class name (parent folder)
train_image_paths = list(train_images_path.rglob("*.jpg"))
train_image_classes = {img_path.name: img_path.parent.name for img_path in train_image_paths}
# Build the training DataFrame: absolute image path plus its class label
df_training = pd.DataFrame(train_image_paths, columns=["Image_Path"])
df_training["labels"] = df_training["Image_Path"].apply(lambda x: Path(x).parent.name)
df_training["Image_Path"] = df_training["Image_Path"].apply(lambda x: str(Path(x).resolve()))
print(df_training.head(10))
# --- Print a few mappings to verify ---
print("Sample Training Image to Class Mappings:")
for img_name, class_label in list(train_image_classes.items())[:5]:
    print(f"{img_name}: {class_label}")
                                          Image_Path                  labels
0  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
1  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
2  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
3  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
4  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
5  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
6  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
7  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
8  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
9  /home/ec2-user/SageMaker/car_data/car_data/tra...  Infiniti QX56 SUV 2011
Sample Training Image to Class Mappings:
05829.jpg: Infiniti QX56 SUV 2011
04532.jpg: Infiniti QX56 SUV 2011
04524.jpg: Infiniti QX56 SUV 2011
04856.jpg: Infiniti QX56 SUV 2011
02413.jpg: Infiniti QX56 SUV 2011
#Test Images class mapping
#Folder where multiple test images are stored
test_class_folders = [f.path for f in os.scandir(test_images_path) if f.is_dir()]
test_image_classes = {} # Dictionary to store testing image: class mapping
test_images_path_root = test_images_path.resolve()
test_images_path_list = list(test_images_path_root.rglob("*.jpg"))
# Create a dictionary mapping image filenames to class names (parent folder)
test_image_classes = {img_path.name: img_path.parent.name for img_path in test_images_path_list}
# Build the testing DataFrame: absolute image path plus its class label
df_testing = pd.DataFrame(test_images_path_list, columns=["Image_Path"])
df_testing["labels"] = df_testing["Image_Path"].apply(lambda x: Path(x).parent.name)
df_testing["Image_Path"] = df_testing["Image_Path"].apply(lambda x: str(Path(x).resolve()))
print(df_testing.head(10))
print("Sample Testing Image to Class Mappings:")
for img_name, class_label in list(test_image_classes.items())[:5]:
    print(f"{img_name}: {class_label}")
                                          Image_Path                  labels
0  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
1  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
2  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
3  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
4  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
5  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
6  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
7  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
8  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
9  /home/ec2-user/SageMaker/car_data/car_data/tes...  Infiniti QX56 SUV 2011
Sample Testing Image to Class Mappings:
01068.jpg: Infiniti QX56 SUV 2011
02434.jpg: Infiniti QX56 SUV 2011
02499.jpg: Infiniti QX56 SUV 2011
04803.jpg: Infiniti QX56 SUV 2011
00478.jpg: Infiniti QX56 SUV 2011
4C. Data Handling - Map Images w.r.t Annotations
train_annotations_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8144 entries, 0 to 8143
Data columns (total 6 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   image_name   8144 non-null   object
 1   xmin         8144 non-null   int64
 2   ymin         8144 non-null   int64
 3   xmax         8144 non-null   int64
 4   ymax         8144 non-null   int64
 5   image_class  8144 non-null   int64
dtypes: int64(5), object(1)
memory usage: 381.9+ KB
# ******** Definition of the method ********************************
def map_images_to_bboxes(annotations_df):
    '''
    Map each image name in the annotations DataFrame to its bounding box tuple.
    '''
    image_bboxes = {}
    try:
        for index, row in annotations_df.iterrows():
            # Store bbox as (x_min, y_min, x_max, y_max)
            image_bboxes[row['image_name']] = (row['xmin'], row['ymin'], row['xmax'], row['ymax'])
    except KeyError as e:
        print(f"Error: Column '{e}' not found in the annotations DataFrame. Check your CSV column names.")
        print("Expected columns: image_name, xmin, ymin, xmax, ymax")
    return image_bboxes
#Train images bounding box mapping
train_image_bboxes = map_images_to_bboxes(train_annotations_df)
# --- Print a few mappings to verify for Training images ---
print("\nSample Training Image to Bounding Box Mappings (DF):")
count = 0
for img_name, bbox in train_image_bboxes.items():
    print(f"{img_name}: {bbox}")
    count += 1
    if count > 5:
        break
Sample Training Image to Bounding Box Mappings (DF):
00001.jpg: (39, 116, 569, 375)
00002.jpg: (36, 116, 868, 587)
00003.jpg: (85, 109, 601, 381)
00004.jpg: (621, 393, 1484, 1096)
00005.jpg: (14, 36, 133, 99)
00006.jpg: (259, 289, 515, 416)
#Test images bounding box mapping
test_image_bboxes = map_images_to_bboxes(test_annotations_df)
# --- Print a few mappings to verify testing images---
print("\nSample Testing Image to Bounding Box Mappings (DF):")
count = 0
for img_name, bbox in test_image_bboxes.items():
    print(f"{img_name}: {bbox}")
    count += 1
    if count > 5:
        break
Sample Testing Image to Bounding Box Mappings (DF):
00001.jpg: (30, 52, 246, 147)
00002.jpg: (100, 19, 576, 203)
00003.jpg: (51, 105, 968, 659)
00004.jpg: (67, 84, 581, 407)
00005.jpg: (140, 151, 593, 339)
00006.jpg: (20, 77, 420, 301)
# Display images with bounding boxes
def display_image_with_bbox(image_path, annotation):
    # Load image
    img = Image.open(image_path)
    # Create plot
    fig, ax = plt.subplots(1)
    ax.imshow(img)
    # Unpack bounding box: [x_min, y_min, x_max, y_max]
    x_min, y_min, x_max, y_max = annotation['bbox']
    # Draw bounding box
    rect = patches.Rectangle(
        (x_min, y_min),   # lower-left corner
        x_max - x_min,    # width
        y_max - y_min,    # height
        linewidth=2,
        edgecolor='r',
        facecolor='none'
    )
    ax.add_patch(rect)
    # Add class label just above the box
    plt.text(
        x_min, y_min - 10,
        annotation['image_class'],
        color='red',
        fontsize=12,
        backgroundcolor='white'
    )
    plt.axis('off')
    plt.show()
# Display bounding box for train images
print("For Training Images")
displayed_image_count = 0  # Counter to track displayed images
image_paths_details_training = []
images_paths_details_testing = []
for index, row in train_annotations_df.iterrows():
    if displayed_image_count >= 5:  # Stop after five images
        break
    image_name = str(row['image_name']).strip()
    image_path = None
    # Search every class folder for this image name
    for class_folder in train_class_folders:
        potential_image_path = os.path.join(class_folder, image_name)
        if os.path.exists(potential_image_path):
            image_path = potential_image_path
            image_paths_details_training.append(potential_image_path)
            break  # Image found, no need to check other class folders
    if image_path:
        annotation = {
            'bbox': [row['xmin'], row['ymin'], row['xmax'], row['ymax']],
            'image_class': row['image_class']
        }
        display_image_with_bbox(image_path, annotation)
        displayed_image_count += 1
print(f"Displayed {displayed_image_count} training images with bounding boxes.")
For Training Images
Displayed 5 training images with bounding boxes.
# Display bounding box for test images
print("For Testing Images")
displayed_image_count_test = 0  # Counter to track displayed images
for index, row in test_annotations_df.iterrows():
    if displayed_image_count_test >= 5:  # Stop after five images
        break
    image_name_test = str(row['image_name']).strip()
    image_path_test = None
    # Search every class folder for this image name
    for class_folder in test_class_folders:
        potential_image_path_test = os.path.join(class_folder, image_name_test)
        if os.path.exists(potential_image_path_test):
            image_path_test = potential_image_path_test
            images_paths_details_testing.append(potential_image_path_test)
            break  # Image found, no need to check other class folders
    if image_path_test:
        annotation_test = {
            'bbox': [row['xmin'], row['ymin'], row['xmax'], row['ymax']],
            'image_class': row['image_class']
        }
        display_image_with_bbox(image_path_test, annotation_test)
        displayed_image_count_test += 1
print(f"Displayed {displayed_image_count_test} test images with bounding boxes.")
For Testing Images
Displayed 5 test images with bounding boxes.
The models designed below are transfer-learning classifiers built on pretrained MobileNetV2, InceptionV3, and ResNet50 backbones (imported above). The shared preprocessing and data-generator utilities come first.
def preprocess_image(image_path, target_size=(224, 224)):
    """
    Load and preprocess an image for CNN input.
    """
    # Check if the image file exists
    if not os.path.exists(image_path):
        print(f"Warning: Image file not found: {image_path}")
        return None
    image = cv2.imread(image_path)  # Load image (BGR)
    # Check if image loading was successful
    if image is None:
        print(f"Warning: Failed to load image: {image_path}")
        return None
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert to RGB
    image = cv2.resize(image, target_size)          # Resize to target size
    image = image / 255.0                           # Normalize pixel values to [0, 1]
    return image
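The final step of `preprocess_image` rescales 8-bit pixel intensities (0-255) into the [0, 1] range the network expects; a minimal sketch of that mapping in isolation:

```python
def normalize_pixel(v):
    """Scale an 8-bit pixel value (0-255) into [0.0, 1.0]."""
    return v / 255.0

# 255 maps to 1.0, 0 to 0.0, and 51 to exactly 0.2
print(normalize_pixel(255), normalize_pixel(0), normalize_pixel(51))
```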
def custom_generator(df, batch_size, target_size):
    """
    Custom generator yielding (images, labels) batches indefinitely.
    """
    num_samples = len(df)
    while True:
        for offset in range(0, num_samples, batch_size):
            batch_samples = df.iloc[offset:offset + batch_size]
            images = []
            labels = []
            for _, row in batch_samples.iterrows():
                images.append(preprocess_image(row['Image_Path'], target_size))
                labels.append(row['label_categorical'])
            X = np.array(images, dtype=np.float32)
            y = np.array(labels, dtype=np.float32)
            yield X, y
# Apply preprocessing to all images
df_testing['image'] = df_testing['Image_Path'].apply(preprocess_image)
df_training['image'] = df_training['Image_Path'].apply(preprocess_image)
# Check for and handle None values in the 'image' column
df_testing = df_testing.dropna(subset=['image']) # Remove rows with None in 'image'
df_training = df_training.dropna(subset=['image']) # Remove rows with None in 'image'
# Encode labels: fit the encoder on the training labels, then reuse it for the test set
label_encoder = LabelEncoder()
df_training['labels_encoded'] = label_encoder.fit_transform(df_training['labels'])
df_testing['labels_encoded'] = label_encoder.transform(df_testing['labels'])
# Convert labels to categorical (one-hot encoding)
num_classes = len(label_encoder.classes_)
df_training['label_categorical'] = df_training['labels_encoded'].apply(lambda x: to_categorical(x, num_classes=num_classes))
df_testing['label_categorical'] = df_testing['labels_encoded'].apply(lambda x: to_categorical(x, num_classes=num_classes))
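`to_categorical` simply turns an integer class index into a one-hot vector of length `num_classes`; a minimal pure-Python equivalent for a single label (illustrative only):

```python
def one_hot(index, num_classes):
    """Pure-Python equivalent of to_categorical for one label."""
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec

print(one_hot(2, 5))  # [0.0, 0.0, 1.0, 0.0, 0.0]
```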
# Split df_training into training and validation sets
df_train, df_val = train_test_split(df_training, test_size=0.2, random_state=42)
# Create generators
#batch_size = 32
batch_size = 16
train_generator = custom_generator(df_train, batch_size, target_size=(224, 224))
val_generator = custom_generator(df_val, batch_size, target_size=(224, 224)) # Use df_val for validation
# Test generator remains the same
test_generator = custom_generator(df_testing, batch_size, target_size=(224, 224))
# Check training generator
X_batch, y_batch = next(train_generator)
print("Training batch shape:", X_batch.shape, y_batch.shape)
# Check validation generator
X_batch, y_batch = next(val_generator)
print("Validation batch shape:", X_batch.shape, y_batch.shape)
Training batch shape: (16, 224, 224, 3) (16, 196)
Validation batch shape: (16, 224, 224, 3) (16, 196)
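With a batch size of 16, the number of generator steps needed to cover a split once is `ceil(n_samples / batch_size)`, matching the `steps_per_epoch` computation used for training below; a quick sketch (the sample count here is hypothetical, roughly 80% of the 8,144 training images):

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    return math.ceil(n_samples / batch_size)

# Hypothetical: 6515 training samples with batches of 16
print(steps_per_epoch(6515, 16))  # 408
```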
#Generate classification report from a Keras/TensorFlow model using GPU-accelerated prediction.
#Assumes df_val['image'] contains pre-loaded images as np.arrays and df_val['label_categorical'] is one-hot encoded.
#returns y_val_pred, y_val_true: Predicted and true label indices
def generate_classification_report_tf_model(
    model,              # trained Keras model
    df_val,             # validation DataFrame
    label_encoder,      # fitted LabelEncoder
    preprocess_fn,      # preprocess_input (kept for interface compatibility)
    batch_size=32,
    report_name="model_report.csv"
):
    # Keep only rows whose image and label were loaded successfully
    images = [img for img in df_val['image'] if img is not None]
    labels = [label for label in df_val['label_categorical'] if label is not None]
    images = np.stack(images).astype(np.float32)
    labels = np.stack(labels)
    # Build a tf.data.Dataset for GPU-friendly batched prediction
    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    dataset = dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
    # Predict
    preds = model.predict(dataset, verbose=1)
    y_val_pred = np.argmax(preds, axis=1)
    y_val_true = np.argmax(labels, axis=1)
    # Evaluation
    acc = accuracy_score(y_val_true, y_val_pred)
    print(f"Model Accuracy: {acc:.4f}\n")
    print("Classification Report:")
    # Save as CSV
    report = classification_report(
        y_val_true, y_val_pred,
        target_names=label_encoder.classes_,
        output_dict=True,
        zero_division=1
    )
    df_report = pd.DataFrame(report).transpose()
    df_report.loc["overall_accuracy"] = [acc, None, None, None]
    df_report.to_csv(report_name)
    print(f"Report saved as: {report_name}")
    # Print only the average metrics
    print("Average Summary Metrics:")
    print(df_report.loc[["macro avg", "weighted avg", "overall_accuracy"]][["precision", "recall", "f1-score"]])
    return y_val_pred, y_val_true, df_report
def plot_training_history(history):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
    # Plot accuracy
    ax1.plot(history.history['accuracy'], label='Training Accuracy')
    ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
    ax1.set_title('Model Accuracy')
    ax1.set_xlabel('Epochs')
    ax1.set_ylabel('Accuracy')
    ax1.legend()
    # Plot loss
    ax2.plot(history.history['loss'], label='Training Loss')
    ax2.plot(history.history['val_loss'], label='Validation Loss')
    ax2.set_title('Model Loss')
    ax2.set_xlabel('Epochs')
    ax2.set_ylabel('Loss')
    ax2.legend()
    plt.show()
6A. MobileNetV2
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))  # the classes argument is ignored when include_top=False
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5 9406464/9406464 [==============================] - 0s 0us/step
# Freeze all but the last 4 layers for efficient training
for layer in base_model.layers[:-4]:
    layer.trainable = False
# Add custom classification layers
x = base_model.output
x = GlobalAveragePooling2D()(x)   # Reduces parameters
x = BatchNormalization()(x)       # Stabilizes training
x = Dense(128, activation='relu')(x)
x = Dropout(0.3)(x)               # Dropout for regularization
predictions = Dense(len(label_encoder.classes_), activation='softmax')(x)  # Output layer
#Split 80-20 of train images
df_train_mobilenet, df_val_mobilenet = train_test_split(df_training, test_size=0.2, random_state=42)
mobilenet_batch_size=16
#df_train_mobilenet_gen = custom_generator(df_train_mobilenet,mobilenet_batch_size,target_size=(128,128))
#df_val_mobilenet_gen = custom_generator(df_val_mobilenet,mobilenet_batch_size,target_size=(128,128))
df_train_mobilenet_gen = custom_generator(df_train_mobilenet,mobilenet_batch_size,target_size=(224,224))
df_val_mobilenet_gen = custom_generator(df_val_mobilenet,mobilenet_batch_size,target_size=(224,224))
# Create the model
mobilenet_model = Model(inputs=base_model.input, outputs=predictions)
# Compile the model
mobilenet_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
mobilenet_model.summary()
# Define steps per epoch
steps_per_epoch = np.ceil(len(df_train_mobilenet) / mobilenet_batch_size).astype(int)
validation_steps = np.ceil(len(df_val_mobilenet) / mobilenet_batch_size).astype(int)
y_true = np.array(df_train_mobilenet['labels_encoded'].tolist())
# Compute class weights based on actual class distribution
class_weights = compute_class_weight('balanced', classes=np.unique(y_true), y=y_true)
class_weight_dict = dict(enumerate(class_weights))
# Train the model
history_mobilenet = mobilenet_model.fit(
    df_train_mobilenet_gen,
    steps_per_epoch=steps_per_epoch,
    validation_data=df_val_mobilenet_gen,
    validation_steps=validation_steps,
    epochs=10  # Reduced epoch count to speed up training
)
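The 'balanced' strategy used for `compute_class_weight` above assigns each class a weight of n_samples / (n_classes * count_c), so rarer classes receive proportionally larger weights. A pure-Python sketch of the same heuristic (with hypothetical toy labels):

```python
from collections import Counter

def balanced_class_weights(y):
    """Mimic sklearn's 'balanced' heuristic: n_samples / (n_classes * count_c)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {c: n_samples / (n_classes * count) for c, count in counts.items()}

# Hypothetical toy labels: class 0 is twice as frequent as class 1,
# so class 1 gets twice the weight
weights = balanced_class_weights([0, 0, 0, 0, 1, 1])
print(weights)  # {0: 0.75, 1: 1.5}
```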
Model: "model" (MobileNetV2 base with custom classification head)
Input: (None, 224, 224, 3)
Backbone: MobileNetV2 inverted-residual blocks (Conv1 ... block_16_project, Conv_1/out_relu), output (None, 7, 7, 1280)
Head: GlobalAveragePooling2D -> BatchNormalization -> Dense(128) -> Dropout -> Dense(196)
==================================================================================================
Total params: 2,452,356 (9.35 MB)
Trainable params: 604,612 (2.31 MB)
Non-trainable params: 1,847,744 (7.05 MB)
__________________________________________________________________________________________________
(full per-layer table omitted)
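The head parameter counts in the summary can be checked by hand: a Dense layer has inputs × units + units parameters, and BatchNormalization carries 4 parameters per channel (gamma, beta, and the two moving statistics):

```python
# Reproduce the classification-head parameter counts from model.summary().
gap_channels = 1280                          # MobileNetV2 feature width after pooling
bn_params = 4 * gap_channels                 # gamma, beta, moving mean, moving variance
dense_params = gap_channels * 128 + 128      # Dense(128): weights + biases
dense_1_params = 128 * 196 + 196             # Dense(196): weights + biases
print(bn_params, dense_params, dense_1_params)  # 5120 163968 25284
```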
Epoch 1/10
(absl/XLA startup logs omitted: XLA service initialized for CUDA; StreamExecutor device: NVIDIA A10G, Compute Capability 8.6; compiled cluster using XLA)
408/408 - 104s 82ms/step - loss: 5.2777 - accuracy: 0.0227 - val_loss: 4.7975 - val_accuracy: 0.0552
Epoch 2/10
408/408 - 27s 65ms/step - loss: 4.2979 - accuracy: 0.1101 - val_loss: 4.2909 - val_accuracy: 0.1130
Epoch 3/10
408/408 - 26s 65ms/step - loss: 3.6382 - accuracy: 0.2106 - val_loss: 3.9245 - val_accuracy: 0.1627
Epoch 4/10
408/408 - 26s 64ms/step - loss: 3.1051 - accuracy: 0.3119 - val_loss: 3.6571 - val_accuracy: 0.1995
Epoch 5/10
408/408 - 25s 62ms/step - loss: 2.6730 - accuracy: 0.4049 - val_loss: 3.4491 - val_accuracy: 0.2314
Epoch 6/10
408/408 - 25s 61ms/step - loss: 2.2819 - accuracy: 0.4973 - val_loss: 3.2925 - val_accuracy: 0.2480
Epoch 7/10
408/408 - 23s 57ms/step - loss: 1.9640 - accuracy: 0.5695 - val_loss: 3.1616 - val_accuracy: 0.2701
Epoch 8/10
408/408 - 21s 52ms/step - loss: 1.6769 - accuracy: 0.6431 - val_loss: 3.0760 - val_accuracy: 0.2897
Epoch 9/10
408/408 - 21s 52ms/step - loss: 1.4317 - accuracy: 0.7002 - val_loss: 2.9941 - val_accuracy: 0.3082
Epoch 10/10
408/408 - 21s 52ms/step - loss: 1.2151 - accuracy: 0.7527 - val_loss: 2.9341 - val_accuracy: 0.3076
# Display model accuracy and loss curves
plot_training_history(history_mobilenet)
y_pred, y_true, df_mobilenet_classification_report = generate_classification_report_tf_model(
    model=mobilenet_model,
    df_val=df_val_mobilenet,
    label_encoder=label_encoder,
    preprocess_fn=mobilenet_preprocess,
    batch_size=32,
    report_name="mobilenet_classification_report.csv"
)
51/51 [==============================] - 5s 20ms/step
Model Accuracy: 0.3069
Classification Report:
Report saved as: mobilenet_classification_report.csv
Model Accuracy: 0.3069
Average Summary Metrics:
precision recall f1-score
macro avg 0.320829 0.309102 0.289196
weighted avg 0.349378 0.306937 0.301969
overall_accuracy 0.306937 NaN NaN
Displaying the confusion matrix for only the top 10 classes (by support)
df_support = df_mobilenet_classification_report.iloc[:-3] # exclude average rows
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
mobilenet_cm = confusion_matrix(y_true, y_pred)
mobilenet_cm_top10 = mobilenet_cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(mobilenet_cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes, yticklabels=top_10_classes,
            cmap='Blues')
plt.title("MobileNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
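`np.ix_` builds an open mesh so the confusion matrix is restricted to the chosen classes along both axes at once. A small example on a hypothetical 4-class matrix:

```python
import numpy as np

cm = np.arange(16).reshape(4, 4)   # hypothetical 4-class confusion matrix
idx = [0, 2]                       # keep only classes 0 and 2
sub = cm[np.ix_(idx, idx)]         # rows AND columns restricted together
print(sub)  # [[ 0  2]
            #  [ 8 10]]
```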
Next Steps
6B. GoogLeNet (InceptionV3)
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Add custom layers on top of the base model
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(196, activation='softmax')(x)
# Define the complete model
googlenet_model = Model(inputs=base_model.input, outputs=predictions)
# Freeze the layers of the base model
for layer in base_model.layers:
    layer.trainable = False
# Compile the model
googlenet_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
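With `include_top=False`, InceptionV3 pools down to a 2048-channel feature vector, so nearly all of the new head's trainable parameters sit in the Dense(1024) layer. A quick arithmetic check:

```python
# Size of the custom head added on top of the frozen InceptionV3 base.
features = 2048                        # InceptionV3 output channels after GAP
dense_1024 = features * 1024 + 1024    # 2,098,176 parameters
dense_196 = 1024 * 196 + 196           # 200,900 parameters
print(dense_1024 + dense_196)          # 2299076 trainable parameters in the head
```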
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_v3/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5 87910968/87910968 [==============================] - 7s 0us/step
googlenet_model.summary()
Model: "model_1" (InceptionV3 base with custom head)
Input: (None, 224, 224, 3)
Stem: Conv2D/BatchNormalization/Activation blocks with two MaxPooling2D stages, down to (None, 25, 25, 192)
Inception modules: mixed0 (None, 25, 25, 256) -> mixed1, mixed2 (None, 25, 25, 288) -> mixed3 (None, 12, 12, 768) -> mixed4, mixed5, mixed6 (None, 12, 12, 768) -> ...
(remainder of the per-layer table omitted)
batch_normalization_65 (Ba (None, 12, 12, 192) 576 ['conv2d_64[0][0]']
tchNormalization)
activation_64 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_65[0][0]
']
conv2d_65 (Conv2D) (None, 12, 12, 192) 258048 ['activation_64[0][0]']
batch_normalization_66 (Ba (None, 12, 12, 192) 576 ['conv2d_65[0][0]']
tchNormalization)
activation_65 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_66[0][0]
']
conv2d_61 (Conv2D) (None, 12, 12, 192) 147456 ['mixed6[0][0]']
conv2d_66 (Conv2D) (None, 12, 12, 192) 258048 ['activation_65[0][0]']
batch_normalization_62 (Ba (None, 12, 12, 192) 576 ['conv2d_61[0][0]']
tchNormalization)
batch_normalization_67 (Ba (None, 12, 12, 192) 576 ['conv2d_66[0][0]']
tchNormalization)
activation_61 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_62[0][0]
']
activation_66 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_67[0][0]
']
conv2d_62 (Conv2D) (None, 12, 12, 192) 258048 ['activation_61[0][0]']
conv2d_67 (Conv2D) (None, 12, 12, 192) 258048 ['activation_66[0][0]']
batch_normalization_63 (Ba (None, 12, 12, 192) 576 ['conv2d_62[0][0]']
tchNormalization)
batch_normalization_68 (Ba (None, 12, 12, 192) 576 ['conv2d_67[0][0]']
tchNormalization)
activation_62 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_63[0][0]
']
activation_67 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_68[0][0]
']
average_pooling2d_6 (Avera (None, 12, 12, 768) 0 ['mixed6[0][0]']
gePooling2D)
conv2d_60 (Conv2D) (None, 12, 12, 192) 147456 ['mixed6[0][0]']
conv2d_63 (Conv2D) (None, 12, 12, 192) 258048 ['activation_62[0][0]']
conv2d_68 (Conv2D) (None, 12, 12, 192) 258048 ['activation_67[0][0]']
conv2d_69 (Conv2D) (None, 12, 12, 192) 147456 ['average_pooling2d_6[0][0]']
batch_normalization_61 (Ba (None, 12, 12, 192) 576 ['conv2d_60[0][0]']
tchNormalization)
batch_normalization_64 (Ba (None, 12, 12, 192) 576 ['conv2d_63[0][0]']
tchNormalization)
batch_normalization_69 (Ba (None, 12, 12, 192) 576 ['conv2d_68[0][0]']
tchNormalization)
batch_normalization_70 (Ba (None, 12, 12, 192) 576 ['conv2d_69[0][0]']
tchNormalization)
activation_60 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_61[0][0]
']
activation_63 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_64[0][0]
']
activation_68 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_69[0][0]
']
activation_69 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_70[0][0]
']
mixed7 (Concatenate) (None, 12, 12, 768) 0 ['activation_60[0][0]',
'activation_63[0][0]',
'activation_68[0][0]',
'activation_69[0][0]']
conv2d_72 (Conv2D) (None, 12, 12, 192) 147456 ['mixed7[0][0]']
batch_normalization_73 (Ba (None, 12, 12, 192) 576 ['conv2d_72[0][0]']
tchNormalization)
activation_72 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_73[0][0]
']
conv2d_73 (Conv2D) (None, 12, 12, 192) 258048 ['activation_72[0][0]']
batch_normalization_74 (Ba (None, 12, 12, 192) 576 ['conv2d_73[0][0]']
tchNormalization)
activation_73 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_74[0][0]
']
conv2d_70 (Conv2D) (None, 12, 12, 192) 147456 ['mixed7[0][0]']
conv2d_74 (Conv2D) (None, 12, 12, 192) 258048 ['activation_73[0][0]']
batch_normalization_71 (Ba (None, 12, 12, 192) 576 ['conv2d_70[0][0]']
tchNormalization)
batch_normalization_75 (Ba (None, 12, 12, 192) 576 ['conv2d_74[0][0]']
tchNormalization)
activation_70 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_71[0][0]
']
activation_74 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_75[0][0]
']
conv2d_71 (Conv2D) (None, 5, 5, 320) 552960 ['activation_70[0][0]']
conv2d_75 (Conv2D) (None, 5, 5, 192) 331776 ['activation_74[0][0]']
batch_normalization_72 (Ba (None, 5, 5, 320) 960 ['conv2d_71[0][0]']
tchNormalization)
batch_normalization_76 (Ba (None, 5, 5, 192) 576 ['conv2d_75[0][0]']
tchNormalization)
activation_71 (Activation) (None, 5, 5, 320) 0 ['batch_normalization_72[0][0]
']
activation_75 (Activation) (None, 5, 5, 192) 0 ['batch_normalization_76[0][0]
']
max_pooling2d_3 (MaxPoolin (None, 5, 5, 768) 0 ['mixed7[0][0]']
g2D)
mixed8 (Concatenate) (None, 5, 5, 1280) 0 ['activation_71[0][0]',
'activation_75[0][0]',
'max_pooling2d_3[0][0]']
conv2d_80 (Conv2D) (None, 5, 5, 448) 573440 ['mixed8[0][0]']
batch_normalization_81 (Ba (None, 5, 5, 448) 1344 ['conv2d_80[0][0]']
tchNormalization)
activation_80 (Activation) (None, 5, 5, 448) 0 ['batch_normalization_81[0][0]
']
conv2d_77 (Conv2D) (None, 5, 5, 384) 491520 ['mixed8[0][0]']
conv2d_81 (Conv2D) (None, 5, 5, 384) 1548288 ['activation_80[0][0]']
batch_normalization_78 (Ba (None, 5, 5, 384) 1152 ['conv2d_77[0][0]']
tchNormalization)
batch_normalization_82 (Ba (None, 5, 5, 384) 1152 ['conv2d_81[0][0]']
tchNormalization)
activation_77 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_78[0][0]
']
activation_81 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_82[0][0]
']
conv2d_78 (Conv2D) (None, 5, 5, 384) 442368 ['activation_77[0][0]']
conv2d_79 (Conv2D) (None, 5, 5, 384) 442368 ['activation_77[0][0]']
conv2d_82 (Conv2D) (None, 5, 5, 384) 442368 ['activation_81[0][0]']
conv2d_83 (Conv2D) (None, 5, 5, 384) 442368 ['activation_81[0][0]']
average_pooling2d_7 (Avera (None, 5, 5, 1280) 0 ['mixed8[0][0]']
gePooling2D)
conv2d_76 (Conv2D) (None, 5, 5, 320) 409600 ['mixed8[0][0]']
batch_normalization_79 (Ba (None, 5, 5, 384) 1152 ['conv2d_78[0][0]']
tchNormalization)
batch_normalization_80 (Ba (None, 5, 5, 384) 1152 ['conv2d_79[0][0]']
tchNormalization)
batch_normalization_83 (Ba (None, 5, 5, 384) 1152 ['conv2d_82[0][0]']
tchNormalization)
batch_normalization_84 (Ba (None, 5, 5, 384) 1152 ['conv2d_83[0][0]']
tchNormalization)
conv2d_84 (Conv2D) (None, 5, 5, 192) 245760 ['average_pooling2d_7[0][0]']
batch_normalization_77 (Ba (None, 5, 5, 320) 960 ['conv2d_76[0][0]']
tchNormalization)
activation_78 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_79[0][0]
']
activation_79 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_80[0][0]
']
activation_82 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_83[0][0]
']
activation_83 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_84[0][0]
']
batch_normalization_85 (Ba (None, 5, 5, 192) 576 ['conv2d_84[0][0]']
tchNormalization)
activation_76 (Activation) (None, 5, 5, 320) 0 ['batch_normalization_77[0][0]
']
mixed9_0 (Concatenate) (None, 5, 5, 768) 0 ['activation_78[0][0]',
'activation_79[0][0]']
concatenate (Concatenate) (None, 5, 5, 768) 0 ['activation_82[0][0]',
'activation_83[0][0]']
activation_84 (Activation) (None, 5, 5, 192) 0 ['batch_normalization_85[0][0]
']
mixed9 (Concatenate) (None, 5, 5, 2048) 0 ['activation_76[0][0]',
'mixed9_0[0][0]',
'concatenate[0][0]',
'activation_84[0][0]']
conv2d_89 (Conv2D) (None, 5, 5, 448) 917504 ['mixed9[0][0]']
batch_normalization_90 (Ba (None, 5, 5, 448) 1344 ['conv2d_89[0][0]']
tchNormalization)
activation_89 (Activation) (None, 5, 5, 448) 0 ['batch_normalization_90[0][0]
']
conv2d_86 (Conv2D) (None, 5, 5, 384) 786432 ['mixed9[0][0]']
conv2d_90 (Conv2D) (None, 5, 5, 384) 1548288 ['activation_89[0][0]']
batch_normalization_87 (Ba (None, 5, 5, 384) 1152 ['conv2d_86[0][0]']
tchNormalization)
batch_normalization_91 (Ba (None, 5, 5, 384) 1152 ['conv2d_90[0][0]']
tchNormalization)
activation_86 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_87[0][0]
']
activation_90 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_91[0][0]
']
conv2d_87 (Conv2D) (None, 5, 5, 384) 442368 ['activation_86[0][0]']
conv2d_88 (Conv2D) (None, 5, 5, 384) 442368 ['activation_86[0][0]']
conv2d_91 (Conv2D) (None, 5, 5, 384) 442368 ['activation_90[0][0]']
conv2d_92 (Conv2D) (None, 5, 5, 384) 442368 ['activation_90[0][0]']
average_pooling2d_8 (Avera (None, 5, 5, 2048) 0 ['mixed9[0][0]']
gePooling2D)
conv2d_85 (Conv2D) (None, 5, 5, 320) 655360 ['mixed9[0][0]']
batch_normalization_88 (Ba (None, 5, 5, 384) 1152 ['conv2d_87[0][0]']
tchNormalization)
batch_normalization_89 (Ba (None, 5, 5, 384) 1152 ['conv2d_88[0][0]']
tchNormalization)
batch_normalization_92 (Ba (None, 5, 5, 384) 1152 ['conv2d_91[0][0]']
tchNormalization)
batch_normalization_93 (Ba (None, 5, 5, 384) 1152 ['conv2d_92[0][0]']
tchNormalization)
conv2d_93 (Conv2D) (None, 5, 5, 192) 393216 ['average_pooling2d_8[0][0]']
batch_normalization_86 (Ba (None, 5, 5, 320) 960 ['conv2d_85[0][0]']
tchNormalization)
activation_87 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_88[0][0]
']
activation_88 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_89[0][0]
']
activation_91 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_92[0][0]
']
activation_92 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_93[0][0]
']
batch_normalization_94 (Ba (None, 5, 5, 192) 576 ['conv2d_93[0][0]']
tchNormalization)
activation_85 (Activation) (None, 5, 5, 320) 0 ['batch_normalization_86[0][0]
']
mixed9_1 (Concatenate) (None, 5, 5, 768) 0 ['activation_87[0][0]',
'activation_88[0][0]']
concatenate_1 (Concatenate (None, 5, 5, 768) 0 ['activation_91[0][0]',
) 'activation_92[0][0]']
activation_93 (Activation) (None, 5, 5, 192) 0 ['batch_normalization_94[0][0]
']
mixed10 (Concatenate) (None, 5, 5, 2048) 0 ['activation_85[0][0]',
'mixed9_1[0][0]',
'concatenate_1[0][0]',
'activation_93[0][0]']
global_average_pooling2d_1 (None, 2048) 0 ['mixed10[0][0]']
(GlobalAveragePooling2D)
dense_2 (Dense) (None, 1024) 2098176 ['global_average_pooling2d_1[0
][0]']
dense_3 (Dense) (None, 196) 200900 ['dense_2[0][0]']
==================================================================================================
Total params: 24101860 (91.94 MB)
Trainable params: 2299076 (8.77 MB)
Non-trainable params: 21802784 (83.17 MB)
__________________________________________________________________________________________________
# Train the GoogleNet (InceptionV3) model using the data generators
googlenet_batch_size = 16
history_googlenet = googlenet_model.fit(
    train_generator,  # uses batches from the training generator
    steps_per_epoch=len(df_train) // googlenet_batch_size,  # number of batches per epoch
    epochs=10,
    validation_data=val_generator,  # uses batches from the validation generator
    validation_steps=len(df_val) // googlenet_batch_size,  # number of validation batches per epoch
)
Epoch 1/10
407/407 [==============================] - 53s 86ms/step - loss: 4.4657 - accuracy: 0.0585 - val_loss: 3.8620 - val_accuracy: 0.1017
Epoch 2/10
407/407 [==============================] - 26s 64ms/step - loss: 3.4662 - accuracy: 0.1531 - val_loss: 3.5989 - val_accuracy: 0.1426
Epoch 3/10
407/407 [==============================] - 26s 65ms/step - loss: 3.0104 - accuracy: 0.2356 - val_loss: 3.4258 - val_accuracy: 0.1841
Epoch 4/10
407/407 [==============================] - 26s 65ms/step - loss: 2.6549 - accuracy: 0.3205 - val_loss: 3.3805 - val_accuracy: 0.1866
Epoch 5/10
407/407 [==============================] - 26s 63ms/step - loss: 2.3704 - accuracy: 0.3773 - val_loss: 3.3718 - val_accuracy: 0.2015
Epoch 6/10
407/407 [==============================] - 25s 61ms/step - loss: 2.1265 - accuracy: 0.4351 - val_loss: 3.3983 - val_accuracy: 0.2126
Epoch 7/10
407/407 [==============================] - 23s 57ms/step - loss: 1.9191 - accuracy: 0.4816 - val_loss: 3.4501 - val_accuracy: 0.2139
Epoch 8/10
407/407 [==============================] - 21s 53ms/step - loss: 1.7333 - accuracy: 0.5312 - val_loss: 3.5264 - val_accuracy: 0.2219
Epoch 9/10
407/407 [==============================] - 21s 52ms/step - loss: 1.5608 - accuracy: 0.5769 - val_loss: 3.6023 - val_accuracy: 0.2343
Epoch 10/10
407/407 [==============================] - 21s 52ms/step - loss: 1.4138 - accuracy: 0.6175 - val_loss: 3.6964 - val_accuracy: 0.2393
# Display model accuracy vs model loss
plot_training_history(history_googlenet)
y_pred, y_true, df_googlenet_classification_report = generate_classification_report_tf_model(
    model=googlenet_model,
    df_val=df_val,
    label_encoder=label_encoder,
    preprocess_fn=googlenet_preprocess,
    batch_size=32,
    report_name="googlenet_classification_report.csv"
)
51/51 [==============================] - 10s 43ms/step
Model Accuracy: 0.2406
Classification Report:
Report saved as: googlenet_classification_report.csv
Model Accuracy: 0.2406
Average Summary Metrics:
precision recall f1-score
macro avg 0.357037 0.240374 0.219307
weighted avg 0.384768 0.240638 0.231432
overall_accuracy 0.240638 NaN NaN
Displaying the top 10 classes of the GoogleNet model in a confusion matrix
df_support = df_googlenet_classification_report.iloc[:-3] # exclude average rows
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
googlenet_cm = confusion_matrix(y_true, y_pred)
googlenet_cm_top10 = googlenet_cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(googlenet_cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes, yticklabels=top_10_classes,
            cmap='Blues')
plt.title("GoogleNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
The inception modules allow the model to learn features at different scales, which can be beneficial for detecting cars of various sizes and orientations.
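To make the multi-scale idea concrete, the sketch below builds a simplified Inception-style block: parallel 1x1, 3x3, and 5x5 convolutions plus a pooling branch, concatenated along the channel axis the way the `mixed*` layers in the summary above are. The filter counts are illustrative, not the actual InceptionV3 values.

```python
from tensorflow.keras import Input, Model, layers

def inception_block(x, f1x1=64, f3x3=96, f5x5=32, fpool=32):
    # 1x1 branch: cheap channel mixing, smallest receptive field
    b1 = layers.Conv2D(f1x1, 1, padding='same', activation='relu')(x)
    # 3x3 branch: mid-scale spatial features
    b2 = layers.Conv2D(f3x3, 3, padding='same', activation='relu')(x)
    # 5x5 branch: larger receptive field for bigger structures
    b3 = layers.Conv2D(f5x5, 5, padding='same', activation='relu')(x)
    # Pooling branch followed by 1x1 projection
    b4 = layers.MaxPooling2D(3, strides=1, padding='same')(x)
    b4 = layers.Conv2D(fpool, 1, padding='same', activation='relu')(b4)
    # Concatenate along the channel axis, as the mixed* layers do
    return layers.Concatenate()([b1, b2, b3, b4])

inp = Input(shape=(12, 12, 768))   # same spatial size as the mixed5-mixed7 stages
out = inception_block(inp)
model = Model(inp, out)
# Output channels = 64 + 96 + 32 + 32 = 224
```

Each branch sees the same input but at a different scale, so the concatenated output carries fine and coarse features side by side.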
Training Accuracy: steadily increases, reaching ~62% by epoch 10. Validation Accuracy: stagnates around 20-24%, indicating poor generalization. Training Loss: decreases smoothly, showing effective learning on the training data.
Validation Loss: plateaus and then rises after epoch 5, a classic sign of overfitting.
Classification Report Analysis: Overall Accuracy: ~24%, indicating poor performance on validation data. Precision: ~36% (macro). Recall: ~24% (very low). F1-Score: ~22%.
Key Issues Identified: overfitting (high variance: training error keeps falling while validation error rises); skewed per-class performance, pointing to class imbalance in the dataset.
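One concrete mitigation for the class-imbalance issue is to weight the loss per class. The sketch below uses scikit-learn's `compute_class_weight` on toy integer labels (stand-ins for the real encoded labels) to build the `class_weight` dict that `model.fit` accepts.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy integer-encoded labels standing in for the real training labels
y_train = np.array([0, 0, 0, 1, 1, 2])
classes = np.unique(y_train)
weights = compute_class_weight(class_weight='balanced', classes=classes, y=y_train)
class_weight = dict(zip(classes, weights))
# 'balanced' gives n_samples / (n_classes * count): rare classes get larger weights,
# so here class 2 (1 sample) is weighted 3x class 0 (3 samples).
# Pass it as: model.fit(train_generator, ..., class_weight=class_weight)
```

This leaves the data pipeline untouched and only rescales each class's contribution to the loss.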
6C. AlexNet
# Define paths
image_dir = 'car_data/car_data/test'
# Prepare data
images = []
labels = []
for index, row in test_annotations_df.iterrows():
    image_name = row['image_name']
    image_path = os.path.join(image_dir, image_name)  # build the full path (was missing)
    # Load and preprocess the image
    image = cv2.imread(image_path)
    image = cv2.resize(image, (227, 227))  # 227x227 is the classic AlexNet input size
                                           # (note: the model below is built for 224x224)
    images.append(image)
    labels.append(row['image_class'])
# Convert to numpy arrays
images = np.array(images)
labels = np.array(labels)
# Encode labels
unique_classes = np.unique(labels)
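The cell above computes `unique_classes` but never actually applies an encoding. A minimal sketch of the missing step, using scikit-learn's `LabelEncoder` and a NumPy one-hot matrix (the label strings here are toy stand-ins):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = np.array(['Audi A4', 'BMW M3', 'Audi A4'])  # toy stand-ins for the real labels
encoder = LabelEncoder()
labels_encoded = encoder.fit_transform(labels)       # integers, classes sorted alphabetically
labels_onehot = np.eye(len(encoder.classes_))[labels_encoded]
# labels_onehot is the one-hot target matrix that categorical_crossentropy expects
```

Keeping the fitted `encoder` around also lets the class names be recovered later with `encoder.inverse_transform`, as the confusion-matrix cells below assume.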
def create_alexnet_model(input_shape, num_classes):
    model = Sequential()
    # First convolutional layer
    model.add(Conv2D(96, (11, 11), strides=(4, 4), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(BatchNormalization())
    # Second convolutional layer
    model.add(Conv2D(256, (5, 5), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(BatchNormalization())
    # Third convolutional layer
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))
    # Fourth convolutional layer
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))
    # Fifth convolutional layer
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(BatchNormalization())
    # Flatten the output
    model.add(Flatten())
    # Fully connected layers
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    return model
# Create the model
input_shape = (224, 224, 3)  # 224x224 to match the data generators (classic AlexNet uses 227x227)
num_classes = len(df_training['labels'].unique())
model = create_alexnet_model(input_shape, num_classes)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Data augmentation
datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.2,
                             height_shift_range=0.2, shear_range=0.2,
                             zoom_range=0.2, horizontal_flip=True,
                             fill_mode='nearest')
epochs = 10
batch_size = 16
train_steps = len(df_train) // batch_size
val_steps = len(df_val) // batch_size
model.summary()
# Note: batch_size must not be passed to fit() when using a generator;
# the generator's own batch size applies.
alexnet_history = model.fit(
    train_generator,
    steps_per_epoch=train_steps,
    epochs=epochs,
    validation_data=val_generator,
    validation_steps=val_steps
)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_94 (Conv2D) (None, 54, 54, 96) 34944
max_pooling2d_4 (MaxPoolin (None, 26, 26, 96) 0
g2D)
batch_normalization_95 (Ba (None, 26, 26, 96) 384
tchNormalization)
conv2d_95 (Conv2D) (None, 26, 26, 256) 614656
max_pooling2d_5 (MaxPoolin (None, 12, 12, 256) 0
g2D)
batch_normalization_96 (Ba (None, 12, 12, 256) 1024
tchNormalization)
conv2d_96 (Conv2D) (None, 12, 12, 384) 885120
conv2d_97 (Conv2D) (None, 12, 12, 384) 1327488
conv2d_98 (Conv2D) (None, 12, 12, 256) 884992
max_pooling2d_6 (MaxPoolin (None, 5, 5, 256) 0
g2D)
batch_normalization_97 (Ba (None, 5, 5, 256) 1024
tchNormalization)
flatten (Flatten) (None, 6400) 0
dense_4 (Dense) (None, 4096) 26218496
dropout_1 (Dropout) (None, 4096) 0
dense_5 (Dense) (None, 4096) 16781312
dropout_2 (Dropout) (None, 4096) 0
dense_6 (Dense) (None, 196) 803012
=================================================================
Total params: 47552452 (181.40 MB)
Trainable params: 47551236 (181.39 MB)
Non-trainable params: 1216 (4.75 KB)
_________________________________________________________________
Epoch 1/10
407/407 [==============================] - 32s 67ms/step - loss: 5.5415 - accuracy: 0.0051 - val_loss: 5.2992 - val_accuracy: 0.0087
Epoch 2/10
407/407 [==============================] - 26s 64ms/step - loss: 5.2811 - accuracy: 0.0068 - val_loss: 5.2869 - val_accuracy: 0.0087
Epoch 3/10
407/407 [==============================] - 26s 63ms/step - loss: 5.2773 - accuracy: 0.0082 - val_loss: 5.2899 - val_accuracy: 0.0087
Epoch 4/10
407/407 [==============================] - 25s 62ms/step - loss: 5.2803 - accuracy: 0.0083 - val_loss: 5.2937 - val_accuracy: 0.0087
Epoch 5/10
407/407 [==============================] - 25s 60ms/step - loss: 5.2759 - accuracy: 0.0083 - val_loss: 5.2935 - val_accuracy: 0.0087
Epoch 6/10
407/407 [==============================] - 23s 57ms/step - loss: 5.2741 - accuracy: 0.0083 - val_loss: 5.2959 - val_accuracy: 0.0087
Epoch 7/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2738 - accuracy: 0.0083 - val_loss: 5.2961 - val_accuracy: 0.0087
Epoch 8/10
407/407 [==============================] - 21s 51ms/step - loss: 5.2734 - accuracy: 0.0083 - val_loss: 5.2964 - val_accuracy: 0.0087
Epoch 9/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2731 - accuracy: 0.0083 - val_loss: 5.2975 - val_accuracy: 0.0087
Epoch 10/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2725 - accuracy: 0.0083 - val_loss: 5.2986 - val_accuracy: 0.0087
# Display model accuracy vs model loss
plot_training_history(alexnet_history)
X_val = np.array([img for img in df_val['image']])
y_val_true = np.array([np.argmax(label) for label in df_val['label_categorical']])
# Predict in one go
y_val_pred = np.argmax(model.predict(X_val), axis=1)
51/51 [==============================] - 1s 11ms/step
alexnet_report = classification_report(
    y_val_true,
    y_val_pred,
    target_names=label_encoder.classes_,
    output_dict=True,
    zero_division=1
)
acc = accuracy_score(y_val_true, y_val_pred)
df_alexnet_classification_report = pd.DataFrame(alexnet_report).transpose()
df_alexnet_classification_report.loc["overall_accuracy"] = [acc, None, None, None]
df_alexnet_classification_report.to_csv("alexnet_classification_report_vectorized.csv")
print(f"Accuracy Score: {acc:.4f}")
print("Average Summary Metrics:")
print(df_alexnet_classification_report.tail(3)[["precision", "recall", "f1-score"]])
Accuracy Score: 0.0086
Average Summary Metrics:
precision recall f1-score
macro avg 0.989840 0.005102 0.000087
weighted avg 0.987796 0.008594 0.000147
overall_accuracy 0.008594 NaN NaN
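The near-perfect ~0.99 precision above is an artifact of `zero_division=1`: classes the model never predicts have undefined precision, which that setting counts as 1.0 instead of 0.0. A small reproduction of the effect:

```python
from sklearn.metrics import precision_score

y_true = [0, 1, 2, 2]
y_pred = [0, 0, 0, 0]   # model collapsed to predicting a single class
# Class 0 precision is 1/4; classes 1 and 2 are never predicted (undefined)
p_ones  = precision_score(y_true, y_pred, average='macro', zero_division=1)
p_zeros = precision_score(y_true, y_pred, average='macro', zero_division=0)
# p_ones = (0.25 + 1 + 1) / 3 = 0.75, while p_zeros = 0.25 / 3 ~ 0.083
```

With 196 classes and a collapsed model, almost every class falls into the undefined case, so `zero_division=0` (or sklearn's default warning behavior) gives a far more honest macro precision here.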
Confusion matrix
df_support = df_alexnet_classification_report.iloc[:-3] # exclude average rows
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
alexnet_cm = confusion_matrix(y_val_true, y_val_pred)
alexnet_cm_top10 = alexnet_cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(alexnet_cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes, yticklabels=top_10_classes,
            cmap='Blues')
plt.title("AlexNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
AlexNet trained from scratch collapsed to ~0.9% validation accuracy (chance level for 196 classes is ~0.5%). Further actions that could be taken include lowering the learning rate, matching the preprocessing input size to the model's expected 224x224 input, and initializing from pretrained weights.
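The flat AlexNet loss curve suggests the optimizer never found a useful descent direction at the default learning rate. A common remedy when training from scratch is a smaller initial rate plus callbacks that back off and stop early; a sketch with illustrative values:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam

optimizer = Adam(learning_rate=1e-4)  # down from Adam's default 1e-3
callbacks = [
    # Halve the learning rate whenever validation loss stalls for 2 epochs
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, min_lr=1e-6),
    # Stop once validation loss has not improved for 4 epochs, keeping the best weights
    EarlyStopping(monitor='val_loss', patience=4, restore_best_weights=True),
]
# model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_generator, ..., callbacks=callbacks)
```

These callbacks also address the GoogleNet overfitting seen earlier, since training would stop near the validation-loss minimum instead of running a fixed 10 epochs.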
6D. ResNet
# Load ResNet50 base model without the top layer
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze base model layers
for layer in base_model.layers:
    layer.trainable = False
# Add custom classification layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(len(df_training['labels_encoded'].unique()), activation='softmax')(x) # Output layer
# Define model
resnet_model = Model(inputs=base_model.input, outputs=x)
# Compile model
resnet_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
# Print model summary
resnet_model.summary()
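After the frozen-base model converges, a common second stage is fine-tuning: unfreeze the last ResNet stage and recompile with a much lower learning rate so the pretrained features are only gently adjusted. A sketch (using `weights=None` to avoid the download; the notebook itself uses `'imagenet'`):

```python
from tensorflow.keras.applications import ResNet50

base_model = ResNet50(weights=None, include_top=False, input_shape=(224, 224, 3))
base_model.trainable = True
for layer in base_model.layers:
    # Keep everything before the conv5 stage frozen; fine-tune only the last stage
    layer.trainable = layer.name.startswith('conv5')
# Then recompile with a small rate, e.g. Adam(learning_rate=1e-5), and fit again
```

Recompiling after changing `trainable` flags is required for the change to take effect, and the reduced learning rate keeps the unfrozen conv5 weights close to their pretrained values.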
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94765736/94765736 [==============================] - 7s 0us/step
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) [(None, 224, 224, 3)] 0 []
conv1_pad (ZeroPadding2D) (None, 230, 230, 3) 0 ['input_3[0][0]']
conv1_conv (Conv2D) (None, 112, 112, 64) 9472 ['conv1_pad[0][0]']
conv1_bn (BatchNormalizati (None, 112, 112, 64) 256 ['conv1_conv[0][0]']
on)
conv1_relu (Activation) (None, 112, 112, 64) 0 ['conv1_bn[0][0]']
pool1_pad (ZeroPadding2D) (None, 114, 114, 64) 0 ['conv1_relu[0][0]']
pool1_pool (MaxPooling2D) (None, 56, 56, 64) 0 ['pool1_pad[0][0]']
conv2_block1_1_conv (Conv2 (None, 56, 56, 64) 4160 ['pool1_pool[0][0]']
D)
conv2_block1_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_1_conv[0][0]']
rmalization)
conv2_block1_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_1_bn[0][0]']
ation)
conv2_block1_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block1_1_relu[0][0]']
D)
conv2_block1_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_2_conv[0][0]']
rmalization)
conv2_block1_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_2_bn[0][0]']
ation)
conv2_block1_0_conv (Conv2 (None, 56, 56, 256) 16640 ['pool1_pool[0][0]']
D)
conv2_block1_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block1_2_relu[0][0]']
D)
conv2_block1_0_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_0_conv[0][0]']
rmalization)
conv2_block1_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_3_conv[0][0]']
rmalization)
conv2_block1_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_0_bn[0][0]',
'conv2_block1_3_bn[0][0]']
conv2_block1_out (Activati (None, 56, 56, 256) 0 ['conv2_block1_add[0][0]']
on)
conv2_block2_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block1_out[0][0]']
D)
conv2_block2_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_1_conv[0][0]']
rmalization)
conv2_block2_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_1_bn[0][0]']
ation)
conv2_block2_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block2_1_relu[0][0]']
D)
conv2_block2_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_2_conv[0][0]']
rmalization)
conv2_block2_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_2_bn[0][0]']
ation)
conv2_block2_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block2_2_relu[0][0]']
D)
conv2_block2_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block2_3_conv[0][0]']
rmalization)
conv2_block2_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_out[0][0]',
'conv2_block2_3_bn[0][0]']
conv2_block2_out (Activati (None, 56, 56, 256) 0 ['conv2_block2_add[0][0]']
on)
conv2_block3_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block2_out[0][0]']
D)
conv2_block3_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_1_conv[0][0]']
rmalization)
conv2_block3_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_1_bn[0][0]']
ation)
conv2_block3_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block3_1_relu[0][0]']
D)
conv2_block3_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_2_conv[0][0]']
rmalization)
conv2_block3_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_2_bn[0][0]']
ation)
conv2_block3_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block3_2_relu[0][0]']
D)
conv2_block3_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block3_3_conv[0][0]']
rmalization)
conv2_block3_add (Add) (None, 56, 56, 256) 0 ['conv2_block2_out[0][0]',
'conv2_block3_3_bn[0][0]']
conv2_block3_out (Activati (None, 56, 56, 256) 0 ['conv2_block3_add[0][0]']
on)
conv3_block1_1_conv (Conv2 (None, 28, 28, 128) 32896 ['conv2_block3_out[0][0]']
D)
conv3_block1_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_1_conv[0][0]']
rmalization)
conv3_block1_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_1_bn[0][0]']
ation)
conv3_block1_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block1_1_relu[0][0]']
D)
conv3_block1_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_2_conv[0][0]']
rmalization)
conv3_block1_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_2_bn[0][0]']
ation)
conv3_block1_0_conv (Conv2 (None, 28, 28, 512) 131584 ['conv2_block3_out[0][0]']
D)
conv3_block1_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block1_2_relu[0][0]']
D)
conv3_block1_0_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_0_conv[0][0]']
rmalization)
conv3_block1_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_3_conv[0][0]']
rmalization)
conv3_block1_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_0_bn[0][0]',
'conv3_block1_3_bn[0][0]']
conv3_block1_out (Activati (None, 28, 28, 512) 0 ['conv3_block1_add[0][0]']
on)
conv3_block2_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block1_out[0][0]']
D)
conv3_block2_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_1_conv[0][0]']
rmalization)
conv3_block2_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_1_bn[0][0]']
ation)
conv3_block2_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block2_1_relu[0][0]']
D)
conv3_block2_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_2_conv[0][0]']
rmalization)
conv3_block2_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_2_bn[0][0]']
ation)
conv3_block2_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block2_2_relu[0][0]']
D)
conv3_block2_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block2_3_conv[0][0]']
rmalization)
conv3_block2_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_out[0][0]',
'conv3_block2_3_bn[0][0]']
conv3_block2_out (Activati (None, 28, 28, 512) 0 ['conv3_block2_add[0][0]']
on)
conv3_block3_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block2_out[0][0]']
D)
conv3_block3_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_1_conv[0][0]']
rmalization)
conv3_block3_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_1_bn[0][0]']
ation)
conv3_block3_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block3_1_relu[0][0]']
D)
conv3_block3_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_2_conv[0][0]']
rmalization)
conv3_block3_2_relu (Activation) (None, 28, 28, 128) 0 ['conv3_block3_2_bn[0][0]']
conv3_block3_3_conv (Conv2D) (None, 28, 28, 512) 66048 ['conv3_block3_2_relu[0][0]']
conv3_block3_3_bn (BatchNormalization) (None, 28, 28, 512) 2048 ['conv3_block3_3_conv[0][0]']
conv3_block3_add (Add) (None, 28, 28, 512) 0 ['conv3_block2_out[0][0]', 'conv3_block3_3_bn[0][0]']
conv3_block3_out (Activation) (None, 28, 28, 512) 0 ['conv3_block3_add[0][0]']
conv3_block4_1_conv (Conv2D) (None, 28, 28, 128) 65664 ['conv3_block3_out[0][0]']
conv3_block4_1_bn (BatchNormalization) (None, 28, 28, 128) 512 ['conv3_block4_1_conv[0][0]']
conv3_block4_1_relu (Activation) (None, 28, 28, 128) 0 ['conv3_block4_1_bn[0][0]']
conv3_block4_2_conv (Conv2D) (None, 28, 28, 128) 147584 ['conv3_block4_1_relu[0][0]']
conv3_block4_2_bn (BatchNormalization) (None, 28, 28, 128) 512 ['conv3_block4_2_conv[0][0]']
conv3_block4_2_relu (Activation) (None, 28, 28, 128) 0 ['conv3_block4_2_bn[0][0]']
conv3_block4_3_conv (Conv2D) (None, 28, 28, 512) 66048 ['conv3_block4_2_relu[0][0]']
conv3_block4_3_bn (BatchNormalization) (None, 28, 28, 512) 2048 ['conv3_block4_3_conv[0][0]']
conv3_block4_add (Add) (None, 28, 28, 512) 0 ['conv3_block3_out[0][0]', 'conv3_block4_3_bn[0][0]']
conv3_block4_out (Activation) (None, 28, 28, 512) 0 ['conv3_block4_add[0][0]']
conv4_block1_1_conv (Conv2D) (None, 14, 14, 256) 131328 ['conv3_block4_out[0][0]']
conv4_block1_1_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block1_1_conv[0][0]']
conv4_block1_1_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block1_1_bn[0][0]']
conv4_block1_2_conv (Conv2D) (None, 14, 14, 256) 590080 ['conv4_block1_1_relu[0][0]']
conv4_block1_2_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block1_2_conv[0][0]']
conv4_block1_2_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block1_2_bn[0][0]']
conv4_block1_0_conv (Conv2D) (None, 14, 14, 1024) 525312 ['conv3_block4_out[0][0]']
conv4_block1_3_conv (Conv2D) (None, 14, 14, 1024) 263168 ['conv4_block1_2_relu[0][0]']
conv4_block1_0_bn (BatchNormalization) (None, 14, 14, 1024) 4096 ['conv4_block1_0_conv[0][0]']
conv4_block1_3_bn (BatchNormalization) (None, 14, 14, 1024) 4096 ['conv4_block1_3_conv[0][0]']
conv4_block1_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_0_bn[0][0]', 'conv4_block1_3_bn[0][0]']
conv4_block1_out (Activation) (None, 14, 14, 1024) 0 ['conv4_block1_add[0][0]']
conv4_block2_1_conv (Conv2D) (None, 14, 14, 256) 262400 ['conv4_block1_out[0][0]']
conv4_block2_1_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block2_1_conv[0][0]']
conv4_block2_1_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block2_1_bn[0][0]']
conv4_block2_2_conv (Conv2D) (None, 14, 14, 256) 590080 ['conv4_block2_1_relu[0][0]']
conv4_block2_2_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block2_2_conv[0][0]']
conv4_block2_2_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block2_2_bn[0][0]']
conv4_block2_3_conv (Conv2D) (None, 14, 14, 1024) 263168 ['conv4_block2_2_relu[0][0]']
conv4_block2_3_bn (BatchNormalization) (None, 14, 14, 1024) 4096 ['conv4_block2_3_conv[0][0]']
conv4_block2_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_out[0][0]', 'conv4_block2_3_bn[0][0]']
conv4_block2_out (Activation) (None, 14, 14, 1024) 0 ['conv4_block2_add[0][0]']
conv4_block3_1_conv (Conv2D) (None, 14, 14, 256) 262400 ['conv4_block2_out[0][0]']
conv4_block3_1_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block3_1_conv[0][0]']
conv4_block3_1_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block3_1_bn[0][0]']
conv4_block3_2_conv (Conv2D) (None, 14, 14, 256) 590080 ['conv4_block3_1_relu[0][0]']
conv4_block3_2_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block3_2_conv[0][0]']
conv4_block3_2_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block3_2_bn[0][0]']
conv4_block3_3_conv (Conv2D) (None, 14, 14, 1024) 263168 ['conv4_block3_2_relu[0][0]']
conv4_block3_3_bn (BatchNormalization) (None, 14, 14, 1024) 4096 ['conv4_block3_3_conv[0][0]']
conv4_block3_add (Add) (None, 14, 14, 1024) 0 ['conv4_block2_out[0][0]', 'conv4_block3_3_bn[0][0]']
conv4_block3_out (Activation) (None, 14, 14, 1024) 0 ['conv4_block3_add[0][0]']
conv4_block4_1_conv (Conv2D) (None, 14, 14, 256) 262400 ['conv4_block3_out[0][0]']
conv4_block4_1_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block4_1_conv[0][0]']
conv4_block4_1_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block4_1_bn[0][0]']
conv4_block4_2_conv (Conv2D) (None, 14, 14, 256) 590080 ['conv4_block4_1_relu[0][0]']
conv4_block4_2_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block4_2_conv[0][0]']
conv4_block4_2_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block4_2_bn[0][0]']
conv4_block4_3_conv (Conv2D) (None, 14, 14, 1024) 263168 ['conv4_block4_2_relu[0][0]']
conv4_block4_3_bn (BatchNormalization) (None, 14, 14, 1024) 4096 ['conv4_block4_3_conv[0][0]']
conv4_block4_add (Add) (None, 14, 14, 1024) 0 ['conv4_block3_out[0][0]', 'conv4_block4_3_bn[0][0]']
conv4_block4_out (Activation) (None, 14, 14, 1024) 0 ['conv4_block4_add[0][0]']
conv4_block5_1_conv (Conv2D) (None, 14, 14, 256) 262400 ['conv4_block4_out[0][0]']
conv4_block5_1_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block5_1_conv[0][0]']
conv4_block5_1_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block5_1_bn[0][0]']
conv4_block5_2_conv (Conv2D) (None, 14, 14, 256) 590080 ['conv4_block5_1_relu[0][0]']
conv4_block5_2_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block5_2_conv[0][0]']
conv4_block5_2_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block5_2_bn[0][0]']
conv4_block5_3_conv (Conv2D) (None, 14, 14, 1024) 263168 ['conv4_block5_2_relu[0][0]']
conv4_block5_3_bn (BatchNormalization) (None, 14, 14, 1024) 4096 ['conv4_block5_3_conv[0][0]']
conv4_block5_add (Add) (None, 14, 14, 1024) 0 ['conv4_block4_out[0][0]', 'conv4_block5_3_bn[0][0]']
conv4_block5_out (Activation) (None, 14, 14, 1024) 0 ['conv4_block5_add[0][0]']
conv4_block6_1_conv (Conv2D) (None, 14, 14, 256) 262400 ['conv4_block5_out[0][0]']
conv4_block6_1_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block6_1_conv[0][0]']
conv4_block6_1_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block6_1_bn[0][0]']
conv4_block6_2_conv (Conv2D) (None, 14, 14, 256) 590080 ['conv4_block6_1_relu[0][0]']
conv4_block6_2_bn (BatchNormalization) (None, 14, 14, 256) 1024 ['conv4_block6_2_conv[0][0]']
conv4_block6_2_relu (Activation) (None, 14, 14, 256) 0 ['conv4_block6_2_bn[0][0]']
conv4_block6_3_conv (Conv2D) (None, 14, 14, 1024) 263168 ['conv4_block6_2_relu[0][0]']
conv4_block6_3_bn (BatchNormalization) (None, 14, 14, 1024) 4096 ['conv4_block6_3_conv[0][0]']
conv4_block6_add (Add) (None, 14, 14, 1024) 0 ['conv4_block5_out[0][0]', 'conv4_block6_3_bn[0][0]']
conv4_block6_out (Activation) (None, 14, 14, 1024) 0 ['conv4_block6_add[0][0]']
conv5_block1_1_conv (Conv2D) (None, 7, 7, 512) 524800 ['conv4_block6_out[0][0]']
conv5_block1_1_bn (BatchNormalization) (None, 7, 7, 512) 2048 ['conv5_block1_1_conv[0][0]']
conv5_block1_1_relu (Activation) (None, 7, 7, 512) 0 ['conv5_block1_1_bn[0][0]']
conv5_block1_2_conv (Conv2D) (None, 7, 7, 512) 2359808 ['conv5_block1_1_relu[0][0]']
conv5_block1_2_bn (BatchNormalization) (None, 7, 7, 512) 2048 ['conv5_block1_2_conv[0][0]']
conv5_block1_2_relu (Activation) (None, 7, 7, 512) 0 ['conv5_block1_2_bn[0][0]']
conv5_block1_0_conv (Conv2D) (None, 7, 7, 2048) 2099200 ['conv4_block6_out[0][0]']
conv5_block1_3_conv (Conv2D) (None, 7, 7, 2048) 1050624 ['conv5_block1_2_relu[0][0]']
conv5_block1_0_bn (BatchNormalization) (None, 7, 7, 2048) 8192 ['conv5_block1_0_conv[0][0]']
conv5_block1_3_bn (BatchNormalization) (None, 7, 7, 2048) 8192 ['conv5_block1_3_conv[0][0]']
conv5_block1_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_0_bn[0][0]', 'conv5_block1_3_bn[0][0]']
conv5_block1_out (Activation) (None, 7, 7, 2048) 0 ['conv5_block1_add[0][0]']
conv5_block2_1_conv (Conv2D) (None, 7, 7, 512) 1049088 ['conv5_block1_out[0][0]']
conv5_block2_1_bn (BatchNormalization) (None, 7, 7, 512) 2048 ['conv5_block2_1_conv[0][0]']
conv5_block2_1_relu (Activation) (None, 7, 7, 512) 0 ['conv5_block2_1_bn[0][0]']
conv5_block2_2_conv (Conv2D) (None, 7, 7, 512) 2359808 ['conv5_block2_1_relu[0][0]']
conv5_block2_2_bn (BatchNormalization) (None, 7, 7, 512) 2048 ['conv5_block2_2_conv[0][0]']
conv5_block2_2_relu (Activation) (None, 7, 7, 512) 0 ['conv5_block2_2_bn[0][0]']
conv5_block2_3_conv (Conv2D) (None, 7, 7, 2048) 1050624 ['conv5_block2_2_relu[0][0]']
conv5_block2_3_bn (BatchNormalization) (None, 7, 7, 2048) 8192 ['conv5_block2_3_conv[0][0]']
conv5_block2_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_out[0][0]', 'conv5_block2_3_bn[0][0]']
conv5_block2_out (Activation) (None, 7, 7, 2048) 0 ['conv5_block2_add[0][0]']
conv5_block3_1_conv (Conv2D) (None, 7, 7, 512) 1049088 ['conv5_block2_out[0][0]']
conv5_block3_1_bn (BatchNormalization) (None, 7, 7, 512) 2048 ['conv5_block3_1_conv[0][0]']
conv5_block3_1_relu (Activation) (None, 7, 7, 512) 0 ['conv5_block3_1_bn[0][0]']
conv5_block3_2_conv (Conv2D) (None, 7, 7, 512) 2359808 ['conv5_block3_1_relu[0][0]']
conv5_block3_2_bn (BatchNormalization) (None, 7, 7, 512) 2048 ['conv5_block3_2_conv[0][0]']
conv5_block3_2_relu (Activation) (None, 7, 7, 512) 0 ['conv5_block3_2_bn[0][0]']
conv5_block3_3_conv (Conv2D) (None, 7, 7, 2048) 1050624 ['conv5_block3_2_relu[0][0]']
conv5_block3_3_bn (BatchNormalization) (None, 7, 7, 2048) 8192 ['conv5_block3_3_conv[0][0]']
conv5_block3_add (Add) (None, 7, 7, 2048) 0 ['conv5_block2_out[0][0]', 'conv5_block3_3_bn[0][0]']
conv5_block3_out (Activation) (None, 7, 7, 2048) 0 ['conv5_block3_add[0][0]']
global_average_pooling2d_2 (GlobalAveragePooling2D) (None, 2048) 0 ['conv5_block3_out[0][0]']
dense_7 (Dense) (None, 512) 1049088 ['global_average_pooling2d_2[0][0]']
dropout_3 (Dropout) (None, 512) 0 ['dense_7[0][0]']
dense_8 (Dense) (None, 196) 100548 ['dropout_3[0][0]']
==================================================================================================
Total params: 24737348 (94.37 MB)
Trainable params: 1149636 (4.39 MB)
Non-trainable params: 23587712 (89.98 MB)
__________________________________________________________________________________________________
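As a sanity check on the summary above, the trainable head's parameter counts can be reproduced by hand from the Dense layer formula (params = inputs × units + units), and they also account exactly for the trainable-parameter total:

```python
# Reproduce the Dense layer parameter counts reported in the model summary.
gap_features = 2048   # output width of global_average_pooling2d_2
dense_units = 512     # dense_7
num_classes = 196     # dense_8 (one unit per car class)

# Dense params = weight matrix (inputs * units) + bias vector (units)
dense_7_params = gap_features * dense_units + dense_units
dense_8_params = dense_units * num_classes + num_classes

print(dense_7_params)                   # 1049088, matching dense_7
print(dense_8_params)                   # 100548, matching dense_8
print(dense_7_params + dense_8_params)  # 1149636, matching "Trainable params"
```

This confirms that only the new head (dense_7 and dense_8) is trainable; the entire ResNet50 backbone (23,587,712 params) is frozen.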
epochs = 10
batch_size = 16
steps_per_epoch = len(df_train) // batch_size
validation_steps = len(df_val) // batch_size
resnet_history = resnet_model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    validation_data=val_generator,
    validation_steps=validation_steps,
    epochs=epochs
)
Epoch 1/10
407/407 [==============================] - 40s 75ms/step - loss: 5.3837 - accuracy: 0.0068 - val_loss: 5.2926 - val_accuracy: 0.0087
Epoch 2/10
407/407 [==============================] - 26s 64ms/step - loss: 5.2883 - accuracy: 0.0060 - val_loss: 5.2853 - val_accuracy: 0.0074
Epoch 3/10
407/407 [==============================] - 26s 64ms/step - loss: 5.2770 - accuracy: 0.0078 - val_loss: 5.2850 - val_accuracy: 0.0099
Epoch 4/10
407/407 [==============================] - 25s 62ms/step - loss: 5.2739 - accuracy: 0.0075 - val_loss: 5.2816 - val_accuracy: 0.0019
Epoch 5/10
407/407 [==============================] - 25s 60ms/step - loss: 5.2711 - accuracy: 0.0069 - val_loss: 5.2787 - val_accuracy: 0.0025
Epoch 6/10
407/407 [==============================] - 23s 57ms/step - loss: 5.2674 - accuracy: 0.0075 - val_loss: 5.2804 - val_accuracy: 0.0025
Epoch 7/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2624 - accuracy: 0.0100 - val_loss: 5.2796 - val_accuracy: 0.0050
Epoch 8/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2574 - accuracy: 0.0091 - val_loss: 5.2800 - val_accuracy: 0.0068
Epoch 9/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2519 - accuracy: 0.0097 - val_loss: 5.2781 - val_accuracy: 0.0087
Epoch 10/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2459 - accuracy: 0.0089 - val_loss: 5.2707 - val_accuracy: 0.0105
# Accuracy/loss graph
plot_training_history(resnet_history)
y_pred, y_true, df_resnet_classification_report = generate_classification_report_tf_model(
    model=resnet_model,
    df_val=df_val,
    label_encoder=label_encoder,
    preprocess_fn=resnet_preprocess,
    batch_size=32,
    report_name="resnet_classification_report.csv"
)
51/51 [==============================] - 7s 44ms/step
Model Accuracy: 0.0104
Classification Report:
Report saved as: resnet_classification_report.csv
Model Accuracy: 0.0104
Average Summary Metrics:
precision recall f1-score
macro avg 0.989922 0.007070 0.000247
weighted avg 0.985440 0.010436 0.000340
overall_accuracy 0.010436 NaN NaN
# Compute confusion matrix
df_support = df_resnet_classification_report.iloc[:-3]  # exclude the average/summary rows
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
# y_true / y_pred are the arrays returned by generate_classification_report_tf_model above
resnet_cm = confusion_matrix(y_true, y_pred)
resnet_cm_top10 = resnet_cm[np.ix_(top_10_indices, top_10_indices)]
# Plot confusion matrix
plt.figure(figsize=(10, 8))
sns.heatmap(resnet_cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes, yticklabels=top_10_classes,
            cmap='Blues')
plt.title("ResNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()
Observation:
With 196 classes, validation loss (~5.28) and validation accuracy (~1%) are stuck near chance level, so the model is effectively not learning under the current setup.
Further actions:
GoogLeNet and ResNet will undergo hyperparameter tuning in the next milestone, since the data is imbalanced and accuracy remains low relative to the loss.
MobileNet and AlexNet are lightweight/shallow models, so they are dropped from further fine-tuning and from comparison with the other models.
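The plateau can be read directly off the numbers: a classifier that outputs a near-uniform distribution over 196 classes incurs cross-entropy loss of ln(196) and accuracy of 1/196, which is almost exactly where the validation metrics sit. A quick check:

```python
import math

num_classes = 196

# A model predicting a uniform distribution over all classes gives
# cross-entropy loss ln(num_classes) and accuracy 1/num_classes.
uniform_loss = math.log(num_classes)
chance_accuracy = 1 / num_classes

print(round(uniform_loss, 4))     # 5.2781 -- essentially the plateaued val_loss seen above
print(round(chance_accuracy, 4))  # 0.0051 -- the observed ~1% accuracy is barely above chance
```

In other words, a validation loss frozen at ~5.278 is the signature of a model that has collapsed to (near-)uniform predictions, not one that is slowly improving.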
from tensorflow.keras.mixed_precision import set_global_policy
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
import random
set_global_policy('mixed_float16')
INFO:tensorflow:Mixed precision compatibility check (mixed_float16): OK Your GPU will likely run quickly with dtype policy mixed_float16 as it has compute capability of at least 7.0. Your GPU: NVIDIA A10G, compute capability 8.6
print("TF Version:", tf.__version__)
print("GPU Available:", tf.config.list_physical_devices('GPU'))
TF Version: 2.16.2 GPU Available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
unique_classes = df_training['labels'].unique()
base_model = InceptionV3(
weights='imagenet',
include_top=False,
input_shape=(224, 224, 3)
)
for layer in base_model.layers[:-50]:
layer.trainable = False
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(512, activation='relu')(x)
x = tf.keras.layers.Dropout(0.5)(x)
output = layers.Dense(196, activation='softmax', dtype='float32')(x) # Force output to float32
googlenet_model_tuned = Model(inputs=base_model.input, outputs=output)
googlenet_model_tuned.compile(
    optimizer=Adam(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',  # labels here are integer class indices; use categorical_crossentropy for one-hot labels
    # Use the Sparse variant to match the integer labels; the plain
    # TopKCategoricalAccuracy expects one-hot targets and reports garbage here.
    metrics=['accuracy', tf.keras.metrics.SparseTopKCategoricalAccuracy(k=5)]
)
batch_size = 16
use_augmentation = True
df_split = df_training.drop(columns=['image']).copy()
df_train_googlenet, df_val_googlenet = train_test_split(df_split, test_size=0.2, random_state=42)
train_paths = df_train_googlenet["Image_Path"].values
train_labels = np.array([np.argmax(label) for label in df_train_googlenet["label_categorical"]])
val_paths = df_val_googlenet["Image_Path"].values
val_labels = np.array([np.argmax(label) for label in df_val_googlenet["label_categorical"]])
data_augmentation = tf.keras.Sequential([
    # No Rescaling layer here: load_and_preprocess below already divides by 255.
    # Rescaling twice would shrink inputs to roughly [0, 0.004] and stall training.
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
    layers.RandomTranslation(0.1, 0.1)
])
def load_and_preprocess(path, label):
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [224, 224])
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

def load_preprocess_with_augment(path, label):
    image, label = load_and_preprocess(path, label)
    image = data_augmentation(image)
    return image, label
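One pitfall worth flagging in this pipeline: load_and_preprocess already scales pixels to [0, 1], so adding another Rescaling(1./255) step anywhere downstream (for example inside the augmentation Sequential) would scale twice and collapse the input range, which alone is enough to make a network appear to stop learning. The arithmetic:

```python
max_pixel = 255.0

once = max_pixel / 255.0   # intended input range after preprocessing: [0, 1]
twice = once / 255.0       # accidental double rescaling: [0, ~0.004]

print(once)             # 1.0
print(round(twice, 6))  # 0.003922 -- nearly black images from the model's point of view
```

With inputs this small, the first convolution's activations are near zero and gradients are vanishingly weak, consistent with a loss that never moves off ln(196).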
# Training dataset
train_ds = tf.data.Dataset.from_tensor_slices((train_paths, train_labels)).shuffle(1000)
# Apply map function based on flag
if use_augmentation:
    train_ds = train_ds.map(load_preprocess_with_augment, num_parallel_calls=tf.data.AUTOTUNE)
else:
    train_ds = train_ds.map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
val_ds = tf.data.Dataset.from_tensor_slices((val_paths, val_labels)) \
    .map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE) \
    .batch(batch_size) \
    .prefetch(tf.data.AUTOTUNE)
callbacks = [
    EarlyStopping(
        monitor='val_loss',
        patience=30,  # note: patience exceeds the 20 training epochs, so early stopping never fires in this run
        restore_best_weights=True,
        verbose=1
    ),
    ModelCheckpoint(
        filepath='googlenet_finetuned_best.keras',
        monitor='val_loss',
        save_best_only=True,
        verbose=1
    ),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-7, verbose=1)
]
train_class_indices = np.array([np.argmax(label) for label in df_train_googlenet["label_categorical"]])  # get class indices
# Compute class weights
class_weights_array = compute_class_weight(
    class_weight='balanced',
    classes=np.unique(train_class_indices),
    y=train_class_indices
)
class_weights = dict(enumerate(class_weights_array))  # convert to dict
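The 'balanced' heuristic used above assigns each class the weight n_samples / (n_classes * count), so rarer classes contribute more to the loss. A small illustration of the formula on toy labels (not the real dataset):

```python
from collections import Counter

# Toy labels: class 1 is under-represented
y = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2]

counts = Counter(y)
n_samples, n_classes = len(y), len(counts)

# 'balanced' weighting: weight_c = n_samples / (n_classes * count_c)
class_weights = {c: n_samples / (n_classes * counts[c]) for c in sorted(counts)}
print(class_weights)  # {0: 1.0, 1: 2.0, 2: 0.6666666666666666}
```

The rare class (1) gets twice the weight of a class at the average frequency, while the over-represented class (2) is down-weighted, which is exactly what `class_weight` passed to `fit` multiplies into each sample's loss.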
history = googlenet_model_tuned.fit(
    train_ds,
    validation_data=val_ds,
    epochs=20,
    callbacks=callbacks,
    class_weight=class_weights  # mitigate class imbalance
)
Epoch 1/20
WARNING:tensorflow:AutoGraph could not transform <function create_autocast_variable at 0x7f7889cc9bd0> and will run it as-is. Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: <gast.gast.Expr object at 0x7f7739b9f8e0> To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Epoch 1: val_loss improved from inf to 5.27864, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 59s 59ms/step - loss: 5.2980 - accuracy: 0.0032 - top_k_categorical_accuracy: 0.0048 - val_loss: 5.2786 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0074 - lr: 0.0010
Epoch 2/20
Epoch 2: val_loss improved from 5.27864 to 5.27856, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 14s 35ms/step - loss: 5.2797 - accuracy: 0.0032 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2786 - val_accuracy: 0.0055 - val_top_k_categorical_accuracy: 0.0080 - lr: 0.0010
Epoch 3/20
Epoch 3: val_loss improved from 5.27856 to 5.27852, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0029 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2785 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0074 - lr: 0.0010
Epoch 4/20
Epoch 4: val_loss improved from 5.27852 to 5.27835, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0045 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2784 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0061 - lr: 0.0010
Epoch 5/20
Epoch 5: val_loss improved from 5.27835 to 5.27826, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 14s 35ms/step - loss: 5.2798 - accuracy: 0.0029 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2783 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0061 - lr: 0.0010
Epoch 6/20
Epoch 6: val_loss did not improve from 5.27826
408/408 [==============================] - 13s 32ms/step - loss: 5.2798 - accuracy: 0.0026 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2783 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0061 - lr: 0.0010
Epoch 7/20
Epoch 7: val_loss improved from 5.27826 to 5.27826, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 14s 35ms/step - loss: 5.2797 - accuracy: 0.0035 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2783 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0049 - lr: 0.0010
Epoch 8/20
Epoch 8: val_loss improved from 5.27826 to 5.27818, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 14s 35ms/step - loss: 5.2798 - accuracy: 0.0034 - top_k_categorical_accuracy: 0.2682 - val_loss: 5.2782 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0123 - lr: 0.0010
Epoch 9/20
Epoch 9: val_loss did not improve from 5.27818
408/408 [==============================] - 13s 32ms/step - loss: 5.2798 - accuracy: 0.0035 - top_k_categorical_accuracy: 0.4744 - val_loss: 5.2783 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0123 - lr: 0.0010
Epoch 10/20
Epoch 10: val_loss improved from 5.27818 to 5.27813, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0025 - top_k_categorical_accuracy: 0.0467 - val_loss: 5.2781 - val_accuracy: 0.0031 - val_top_k_categorical_accuracy: 0.0104 - lr: 0.0010
Epoch 11/20
Epoch 11: val_loss improved from 5.27813 to 5.27806, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 14s 35ms/step - loss: 5.2798 - accuracy: 0.0025 - top_k_categorical_accuracy: 0.2087 - val_loss: 5.2781 - val_accuracy: 0.0031 - val_top_k_categorical_accuracy: 0.0092 - lr: 0.0010
Epoch 12/20
Epoch 12: val_loss did not improve from 5.27806
408/408 [==============================] - 13s 32ms/step - loss: 5.2797 - accuracy: 0.0040 - top_k_categorical_accuracy: 0.3659 - val_loss: 5.2781 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0098 - lr: 0.0010
Epoch 13/20
Epoch 13: val_loss improved from 5.27806 to 5.27805, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 14s 35ms/step - loss: 5.2798 - accuracy: 0.0028 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2781 - val_accuracy: 0.0031 - val_top_k_categorical_accuracy: 0.0086 - lr: 0.0010
Epoch 14/20
Epoch 14: val_loss did not improve from 5.27805
408/408 [==============================] - 13s 32ms/step - loss: 5.2798 - accuracy: 0.0048 - top_k_categorical_accuracy: 0.1031 - val_loss: 5.2781 - val_accuracy: 0.0031 - val_top_k_categorical_accuracy: 0.0086 - lr: 0.0010
Epoch 15/20
Epoch 15: val_loss improved from 5.27805 to 5.27799, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0028 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2780 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0074 - lr: 0.0010
Epoch 16/20
Epoch 16: val_loss did not improve from 5.27799
Epoch 16: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
408/408 [==============================] - 13s 32ms/step - loss: 5.2797 - accuracy: 0.0037 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2780 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0068 - lr: 0.0010
Epoch 17/20
Epoch 17: val_loss did not improve from 5.27799
408/408 [==============================] - 13s 33ms/step - loss: 5.2790 - accuracy: 0.0029 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2780 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0055 - lr: 5.0000e-04
Epoch 18/20
Epoch 18: val_loss did not improve from 5.27799
408/408 [==============================] - 13s 32ms/step - loss: 5.2790 - accuracy: 0.0043 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2780 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0061 - lr: 5.0000e-04
Epoch 19/20
Epoch 19: val_loss did not improve from 5.27799
408/408 [==============================] - 13s 32ms/step - loss: 5.2790 - accuracy: 0.0043 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2780 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0061 - lr: 5.0000e-04
Epoch 20/20
Epoch 20: val_loss improved from 5.27799 to 5.27799, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 14s 35ms/step - loss: 5.2790 - accuracy: 0.0025 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2780 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0061 - lr: 5.0000e-04
Restoring model weights from the end of the best epoch: 20.
Train/Val Loss Graph
plot_training_history(history)
# Get all predictions and true labels
y_true = []
y_pred = []
for X_batch, y_batch in val_ds:
    preds = googlenet_model_tuned.predict(X_batch)
    y_pred_batch = np.argmax(preds, axis=1)
    y_true_batch = y_batch.numpy() if hasattr(y_batch, "numpy") else y_batch
    y_true.extend(y_true_batch)
    y_pred.extend(y_pred_batch)
1/1 [==============================] - 7s 7s/step (followed by one similar ~28 ms/step progress line per validation batch)
target_names = label_encoder.classes_ if 'label_encoder' in globals() else None
report = classification_report(
    y_true, y_pred,
    target_names=target_names,
    output_dict=True,
    zero_division=1
)
df_googlenet_tuned_report = pd.DataFrame(report).transpose()
acc = accuracy_score(y_true, y_pred)
df_googlenet_tuned_report.loc["overall_accuracy"] = [acc, None, None, None]
df_googlenet_tuned_report.to_csv("googlenet_tuned_classification_report.csv")
print(f"Tuned GoogLeNet Accuracy: {acc:.4f}")
print("Average Summary Metrics:")
print(df_googlenet_tuned_report.tail(3)[["precision", "recall", "f1-score"]])
Tuned GoogLeNet Accuracy: 0.0037
Average Summary Metrics:
precision recall f1-score
macro avg 0.638079 0.002648 0.000545
weighted avg 0.628435 0.003683 0.000747
overall_accuracy 0.003683 NaN NaN
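The macro-average precision (~0.64 here, ~0.99 for ResNet) looks suspiciously high next to the near-zero recall. That is an artifact of `zero_division=1`: precision for a class the model never predicts is 0/0, and this setting reports it as 1.0 instead of 0.0, inflating the macro average when most classes are never predicted. A toy illustration of the convention (hypothetical labels, not the real data):

```python
def precision_per_class(y_true, y_pred, n_classes, zero_division=1.0):
    """Per-class precision = TP / (TP + FP); undefined when a class is never predicted."""
    precs = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        predicted = sum(1 for p in y_pred if p == c)
        # zero_division decides what 0/0 becomes (sklearn's zero_division parameter)
        precs.append(tp / predicted if predicted else zero_division)
    return precs

# 4 classes, but the model only ever predicts class 0
y_true = [0, 1, 2, 3]
y_pred = [0, 0, 0, 0]

precs = precision_per_class(y_true, y_pred, 4)
print(precs)                    # [0.25, 1.0, 1.0, 1.0]
print(sum(precs) / len(precs))  # macro precision 0.8125, despite only 25% accuracy
```

With `zero_division=0` the same run would report macro precision 0.0625, which is the more honest number for a collapsed classifier; the recall and f1 columns are the ones to trust here.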
Confusion matrix for the tuned model
cm = confusion_matrix(y_true, y_pred)
df_support = df_googlenet_tuned_report.iloc[:-3]
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
# Get class indices (map from class name to index)
if target_names is not None:
    top_10_indices = [np.where(target_names == cls)[0][0] for cls in top_10_classes]
else:
    top_10_indices = list(map(int, top_10_classes))  # fallback if no class names
cm_top10 = cm[np.ix_(top_10_indices, top_10_indices)]
# Plot
plt.figure(figsize=(10, 8))
sns.heatmap(cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes,
            yticklabels=top_10_classes,
            cmap='Blues')
plt.title("Tuned GoogLeNet - Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
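`np.ix_` builds an open mesh so the same index list selects matching rows and columns, yielding the square sub-confusion-matrix for the chosen classes. A minimal sketch:

```python
import numpy as np

# Toy 4x4 "confusion matrix"; keep only classes 1 and 3
cm = np.arange(16).reshape(4, 4)
sub = cm[np.ix_([1, 3], [1, 3])]
print(sub)
# [[ 5  7]
#  [13 15]]
```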
Handling class imbalance
# Encode labels
label_encoder = LabelEncoder()
df_training['labels_encoded'] = label_encoder.fit_transform(df_training['labels'])
df_training['labels'] = df_training['labels'].astype(str)
df_val['labels'] = df_val['labels'].astype(str)
# Calculate class weights (important for class imbalance)
class_weights = class_weight.compute_class_weight(
    'balanced',
    classes=np.unique(df_training['labels_encoded']),
    y=df_training['labels_encoded']
)
class_weights = dict(enumerate(class_weights))  # Convert to dictionary
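`compute_class_weight('balanced')` assigns each class the weight `n_samples / (n_classes * class_count)`, so rarer classes get proportionally larger weights. A toy sketch (not project data):

```python
import numpy as np
from sklearn.utils import class_weight

# Class 0 appears 4 times, class 1 once (5 samples, 2 classes)
y = np.array([0, 0, 0, 0, 1])
w = class_weight.compute_class_weight('balanced', classes=np.unique(y), y=y)
weights = dict(enumerate(w))
print(weights)  # {0: 0.625, 1: 2.5}, i.e. 5/(2*4) and 5/(2*1)
```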
batch_size = 32
image_size = (224, 224)
train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
    # ,preprocessing_function=resnet_preprocess
)
train_generator = train_datagen.flow_from_dataframe(
    df_training,
    x_col='Image_Path',
    y_col='labels',
    target_size=image_size,
    batch_size=batch_size,
    class_mode='categorical'
)
Found 8144 validated image filenames belonging to 196 classes.
#val_datagen = ImageDataGenerator(preprocessing_function=resnet_preprocess)
val_datagen = ImageDataGenerator()
val_generator = val_datagen.flow_from_dataframe(
    df_val,
    x_col='Image_Path',
    y_col='labels',
    target_size=image_size,
    batch_size=batch_size,
    class_mode='categorical'
)
Found 1629 validated image filenames belonging to 196 classes.
Model definition
# Load ResNet50 base model without the top layer
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Unfreeze some layers of the base model for fine-tuning (last 40 layers)
for layer in base_model.layers:
    layer.trainable = False
for layer in base_model.layers[-40:]:
    layer.trainable = True
num_classes = df_training['labels'].nunique()  # same dataframe as the generators
# Add custom classification layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(num_classes, activation='softmax')(x) # Output layer
# Define model
resnet_tuned_model = Model(inputs=base_model.input, outputs=x)
# Compile the model with a low learning rate (1e-5), as is typical for fine-tuning
resnet_tuned_model.compile(optimizer=Adam(learning_rate=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
resnet_tuned_model.summary()
Model: "model_4"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_5 (InputLayer) [(None, 224, 224, 3)] 0 []
conv1_pad (ZeroPadding2D) (None, 230, 230, 3) 0 ['input_5[0][0]']
conv1_conv (Conv2D) (None, 112, 112, 64) 9472 ['conv1_pad[0][0]']
conv1_bn (BatchNormalizati (None, 112, 112, 64) 256 ['conv1_conv[0][0]']
on)
conv1_relu (Activation) (None, 112, 112, 64) 0 ['conv1_bn[0][0]']
pool1_pad (ZeroPadding2D) (None, 114, 114, 64) 0 ['conv1_relu[0][0]']
pool1_pool (MaxPooling2D) (None, 56, 56, 64) 0 ['pool1_pad[0][0]']
conv2_block1_1_conv (Conv2 (None, 56, 56, 64) 4160 ['pool1_pool[0][0]']
D)
conv2_block1_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_1_conv[0][0]']
rmalization)
conv2_block1_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_1_bn[0][0]']
ation)
conv2_block1_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block1_1_relu[0][0]']
D)
conv2_block1_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_2_conv[0][0]']
rmalization)
conv2_block1_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_2_bn[0][0]']
ation)
conv2_block1_0_conv (Conv2 (None, 56, 56, 256) 16640 ['pool1_pool[0][0]']
D)
conv2_block1_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block1_2_relu[0][0]']
D)
conv2_block1_0_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_0_conv[0][0]']
rmalization)
conv2_block1_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_3_conv[0][0]']
rmalization)
conv2_block1_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_0_bn[0][0]',
'conv2_block1_3_bn[0][0]']
conv2_block1_out (Activati (None, 56, 56, 256) 0 ['conv2_block1_add[0][0]']
on)
conv2_block2_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block1_out[0][0]']
D)
conv2_block2_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_1_conv[0][0]']
rmalization)
conv2_block2_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_1_bn[0][0]']
ation)
conv2_block2_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block2_1_relu[0][0]']
D)
conv2_block2_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_2_conv[0][0]']
rmalization)
conv2_block2_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_2_bn[0][0]']
ation)
conv2_block2_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block2_2_relu[0][0]']
D)
conv2_block2_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block2_3_conv[0][0]']
rmalization)
conv2_block2_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_out[0][0]',
'conv2_block2_3_bn[0][0]']
conv2_block2_out (Activati (None, 56, 56, 256) 0 ['conv2_block2_add[0][0]']
on)
conv2_block3_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block2_out[0][0]']
D)
conv2_block3_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_1_conv[0][0]']
rmalization)
conv2_block3_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_1_bn[0][0]']
ation)
conv2_block3_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block3_1_relu[0][0]']
D)
conv2_block3_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_2_conv[0][0]']
rmalization)
conv2_block3_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_2_bn[0][0]']
ation)
conv2_block3_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block3_2_relu[0][0]']
D)
conv2_block3_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block3_3_conv[0][0]']
rmalization)
conv2_block3_add (Add) (None, 56, 56, 256) 0 ['conv2_block2_out[0][0]',
'conv2_block3_3_bn[0][0]']
conv2_block3_out (Activati (None, 56, 56, 256) 0 ['conv2_block3_add[0][0]']
on)
conv3_block1_1_conv (Conv2 (None, 28, 28, 128) 32896 ['conv2_block3_out[0][0]']
D)
conv3_block1_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_1_conv[0][0]']
rmalization)
conv3_block1_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_1_bn[0][0]']
ation)
conv3_block1_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block1_1_relu[0][0]']
D)
conv3_block1_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_2_conv[0][0]']
rmalization)
conv3_block1_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_2_bn[0][0]']
ation)
conv3_block1_0_conv (Conv2 (None, 28, 28, 512) 131584 ['conv2_block3_out[0][0]']
D)
conv3_block1_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block1_2_relu[0][0]']
D)
conv3_block1_0_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_0_conv[0][0]']
rmalization)
conv3_block1_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_3_conv[0][0]']
rmalization)
conv3_block1_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_0_bn[0][0]',
'conv3_block1_3_bn[0][0]']
conv3_block1_out (Activati (None, 28, 28, 512) 0 ['conv3_block1_add[0][0]']
on)
conv3_block2_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block1_out[0][0]']
D)
conv3_block2_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_1_conv[0][0]']
rmalization)
conv3_block2_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_1_bn[0][0]']
ation)
conv3_block2_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block2_1_relu[0][0]']
D)
conv3_block2_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_2_conv[0][0]']
rmalization)
conv3_block2_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_2_bn[0][0]']
ation)
conv3_block2_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block2_2_relu[0][0]']
D)
conv3_block2_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block2_3_conv[0][0]']
rmalization)
conv3_block2_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_out[0][0]',
'conv3_block2_3_bn[0][0]']
conv3_block2_out (Activati (None, 28, 28, 512) 0 ['conv3_block2_add[0][0]']
on)
conv3_block3_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block2_out[0][0]']
D)
conv3_block3_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_1_conv[0][0]']
rmalization)
conv3_block3_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_1_bn[0][0]']
ation)
conv3_block3_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block3_1_relu[0][0]']
D)
conv3_block3_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_2_conv[0][0]']
rmalization)
conv3_block3_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_2_bn[0][0]']
ation)
conv3_block3_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block3_2_relu[0][0]']
D)
conv3_block3_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block3_3_conv[0][0]']
rmalization)
conv3_block3_add (Add) (None, 28, 28, 512) 0 ['conv3_block2_out[0][0]',
'conv3_block3_3_bn[0][0]']
conv3_block3_out (Activati (None, 28, 28, 512) 0 ['conv3_block3_add[0][0]']
on)
conv3_block4_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block3_out[0][0]']
D)
conv3_block4_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_1_conv[0][0]']
rmalization)
conv3_block4_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_1_bn[0][0]']
ation)
conv3_block4_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block4_1_relu[0][0]']
D)
conv3_block4_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_2_conv[0][0]']
rmalization)
conv3_block4_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_2_bn[0][0]']
ation)
conv3_block4_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block4_2_relu[0][0]']
D)
conv3_block4_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block4_3_conv[0][0]']
rmalization)
conv3_block4_add (Add) (None, 28, 28, 512) 0 ['conv3_block3_out[0][0]',
'conv3_block4_3_bn[0][0]']
conv3_block4_out (Activati (None, 28, 28, 512) 0 ['conv3_block4_add[0][0]']
on)
conv4_block1_1_conv (Conv2 (None, 14, 14, 256) 131328 ['conv3_block4_out[0][0]']
D)
conv4_block1_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_1_conv[0][0]']
rmalization)
conv4_block1_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_1_bn[0][0]']
ation)
conv4_block1_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block1_1_relu[0][0]']
D)
conv4_block1_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_2_conv[0][0]']
rmalization)
conv4_block1_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_2_bn[0][0]']
ation)
conv4_block1_0_conv (Conv2 (None, 14, 14, 1024) 525312 ['conv3_block4_out[0][0]']
D)
conv4_block1_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block1_2_relu[0][0]']
D)
conv4_block1_0_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_0_conv[0][0]']
rmalization)
conv4_block1_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_3_conv[0][0]']
rmalization)
conv4_block1_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_0_bn[0][0]',
'conv4_block1_3_bn[0][0]']
conv4_block1_out (Activati (None, 14, 14, 1024) 0 ['conv4_block1_add[0][0]']
on)
conv4_block2_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block1_out[0][0]']
D)
conv4_block2_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_1_conv[0][0]']
rmalization)
conv4_block2_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_1_bn[0][0]']
ation)
conv4_block2_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block2_1_relu[0][0]']
D)
conv4_block2_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_2_conv[0][0]']
rmalization)
conv4_block2_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_2_bn[0][0]']
ation)
conv4_block2_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block2_2_relu[0][0]']
D)
conv4_block2_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block2_3_conv[0][0]']
rmalization)
conv4_block2_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_out[0][0]',
'conv4_block2_3_bn[0][0]']
conv4_block2_out (Activati (None, 14, 14, 1024) 0 ['conv4_block2_add[0][0]']
on)
conv4_block3_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block2_out[0][0]']
D)
conv4_block3_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_1_conv[0][0]']
rmalization)
conv4_block3_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_1_bn[0][0]']
ation)
conv4_block3_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block3_1_relu[0][0]']
D)
conv4_block3_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_2_conv[0][0]']
rmalization)
conv4_block3_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_2_bn[0][0]']
ation)
conv4_block3_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block3_2_relu[0][0]']
D)
conv4_block3_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block3_3_conv[0][0]']
rmalization)
conv4_block3_add (Add) (None, 14, 14, 1024) 0 ['conv4_block2_out[0][0]',
'conv4_block3_3_bn[0][0]']
conv4_block3_out (Activati (None, 14, 14, 1024) 0 ['conv4_block3_add[0][0]']
on)
conv4_block4_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block3_out[0][0]']
D)
conv4_block4_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_1_conv[0][0]']
rmalization)
conv4_block4_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_1_bn[0][0]']
ation)
conv4_block4_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block4_1_relu[0][0]']
D)
conv4_block4_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_2_conv[0][0]']
rmalization)
conv4_block4_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_2_bn[0][0]']
ation)
conv4_block4_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block4_2_relu[0][0]']
D)
conv4_block4_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block4_3_conv[0][0]']
rmalization)
conv4_block4_add (Add) (None, 14, 14, 1024) 0 ['conv4_block3_out[0][0]',
'conv4_block4_3_bn[0][0]']
conv4_block4_out (Activati (None, 14, 14, 1024) 0 ['conv4_block4_add[0][0]']
on)
conv4_block5_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block4_out[0][0]']
D)
conv4_block5_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_1_conv[0][0]']
rmalization)
conv4_block5_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_1_bn[0][0]']
ation)
conv4_block5_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block5_1_relu[0][0]']
D)
conv4_block5_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_2_conv[0][0]']
rmalization)
conv4_block5_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_2_bn[0][0]']
ation)
conv4_block5_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block5_2_relu[0][0]']
D)
conv4_block5_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block5_3_conv[0][0]']
rmalization)
conv4_block5_add (Add) (None, 14, 14, 1024) 0 ['conv4_block4_out[0][0]',
'conv4_block5_3_bn[0][0]']
conv4_block5_out (Activati (None, 14, 14, 1024) 0 ['conv4_block5_add[0][0]']
on)
conv4_block6_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block5_out[0][0]']
D)
conv4_block6_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_1_conv[0][0]']
rmalization)
conv4_block6_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_1_bn[0][0]']
ation)
conv4_block6_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block6_1_relu[0][0]']
D)
conv4_block6_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_2_conv[0][0]']
rmalization)
conv4_block6_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_2_bn[0][0]']
ation)
conv4_block6_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block6_2_relu[0][0]']
D)
conv4_block6_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block6_3_conv[0][0]']
rmalization)
conv4_block6_add (Add) (None, 14, 14, 1024) 0 ['conv4_block5_out[0][0]',
'conv4_block6_3_bn[0][0]']
conv4_block6_out (Activati (None, 14, 14, 1024) 0 ['conv4_block6_add[0][0]']
on)
conv5_block1_1_conv (Conv2 (None, 7, 7, 512) 524800 ['conv4_block6_out[0][0]']
D)
conv5_block1_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_1_conv[0][0]']
rmalization)
conv5_block1_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_1_bn[0][0]']
ation)
conv5_block1_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block1_1_relu[0][0]']
D)
conv5_block1_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_2_conv[0][0]']
rmalization)
conv5_block1_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_2_bn[0][0]']
ation)
conv5_block1_0_conv (Conv2 (None, 7, 7, 2048) 2099200 ['conv4_block6_out[0][0]']
D)
conv5_block1_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block1_2_relu[0][0]']
D)
conv5_block1_0_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_0_conv[0][0]']
rmalization)
conv5_block1_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_3_conv[0][0]']
rmalization)
conv5_block1_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_0_bn[0][0]',
'conv5_block1_3_bn[0][0]']
conv5_block1_out (Activati (None, 7, 7, 2048) 0 ['conv5_block1_add[0][0]']
on)
conv5_block2_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block1_out[0][0]']
D)
conv5_block2_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_1_conv[0][0]']
rmalization)
conv5_block2_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_1_bn[0][0]']
ation)
conv5_block2_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block2_1_relu[0][0]']
D)
conv5_block2_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_2_conv[0][0]']
rmalization)
conv5_block2_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_2_bn[0][0]']
ation)
conv5_block2_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block2_2_relu[0][0]']
D)
conv5_block2_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block2_3_conv[0][0]']
rmalization)
conv5_block2_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_out[0][0]',
'conv5_block2_3_bn[0][0]']
conv5_block2_out (Activati (None, 7, 7, 2048) 0 ['conv5_block2_add[0][0]']
on)
conv5_block3_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block2_out[0][0]']
D)
conv5_block3_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_1_conv[0][0]']
rmalization)
conv5_block3_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_1_bn[0][0]']
ation)
conv5_block3_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block3_1_relu[0][0]']
D)
conv5_block3_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_2_conv[0][0]']
rmalization)
conv5_block3_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_2_bn[0][0]']
ation)
conv5_block3_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block3_2_relu[0][0]']
D)
conv5_block3_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block3_3_conv[0][0]']
rmalization)
conv5_block3_add (Add) (None, 7, 7, 2048) 0 ['conv5_block2_out[0][0]',
'conv5_block3_3_bn[0][0]']
conv5_block3_out (Activati (None, 7, 7, 2048) 0 ['conv5_block3_add[0][0]']
on)
global_average_pooling2d_4 (None, 2048) 0 ['conv5_block3_out[0][0]']
(GlobalAveragePooling2D)
dense_11 (Dense) (None, 512) 1049088 ['global_average_pooling2d_4[0
][0]']
dropout_5 (Dropout) (None, 512) 0 ['dense_11[0][0]']
dense_12 (Dense) (None, 196) 100548 ['dropout_5[0][0]']
==================================================================================================
Total params: 24737348 (94.37 MB)
Trainable params: 16981444 (64.78 MB)
Non-trainable params: 7755904 (29.59 MB)
__________________________________________________________________________________________________
Data Generation
# Custom data generator: attaches per-sample weights derived from the class weights
def custom_data_generator(generator, class_weights):
    while True:
        x, y = next(generator)
        sample_weights = np.array([class_weights[label] for label in np.argmax(y, axis=1)])
        yield x, y, sample_weights
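The generator recovers each sample's class index from its one-hot row and looks up the corresponding weight. That lookup step can be checked in isolation with a hypothetical batch:

```python
import numpy as np

# Hypothetical one-hot batch: samples of class 0, 1, 0
y = np.array([[1, 0], [0, 1], [1, 0]], dtype=float)
class_weights = {0: 0.625, 1: 2.5}

# Same expression as in custom_data_generator above
sample_weights = np.array([class_weights[label] for label in np.argmax(y, axis=1)])
print(sample_weights)  # [0.625 2.5   0.625]
```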
# Compute steps so the final partial batch is included
steps_per_epoch = np.ceil(len(df_training) / batch_size).astype(int)
validation_steps = np.ceil(len(df_val) / batch_size).astype(int)
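Using `np.ceil` instead of floor division ensures the final partial batch is not dropped; with 8144 training images and a batch size of 32 this yields the 255 steps per epoch seen in the training log:

```python
import numpy as np

n, batch = 8144, 32            # training set size and batch size from above
floor_steps = n // batch       # would drop the last partial batch of 16 images
ceil_steps = int(np.ceil(n / batch))
print(floor_steps, ceil_steps)  # 254 255
```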
# Define number of fine-tuning epochs
fine_tune_epochs = 20
# Define callbacks
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model_checkpoint = ModelCheckpoint('best_model_tuned_resnet.keras', save_best_only=True, monitor='val_loss')
# Train the model with callbacks, checkpointing, and class weights
resnet_history_fine_tune = resnet_tuned_model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    validation_data=val_generator,
    validation_steps=validation_steps,
    epochs=fine_tune_epochs,
    callbacks=[early_stopping, model_checkpoint],
    class_weight=class_weights
)
Epoch 1/20  255/255 - 104s 348ms/step - loss: 5.5599 - accuracy: 0.0054 - val_loss: 5.2763 - val_accuracy: 0.0055
Epoch 2/20  255/255 - 86s 339ms/step - loss: 5.3193 - accuracy: 0.0077 - val_loss: 5.1636 - val_accuracy: 0.0135
Epoch 3/20  255/255 - 87s 339ms/step - loss: 5.2150 - accuracy: 0.0138 - val_loss: 5.0478 - val_accuracy: 0.0344
Epoch 4/20  255/255 - 86s 337ms/step - loss: 5.1253 - accuracy: 0.0249 - val_loss: 4.9025 - val_accuracy: 0.0552
Epoch 5/20  255/255 - 86s 339ms/step - loss: 5.0139 - accuracy: 0.0332 - val_loss: 4.7125 - val_accuracy: 0.0896
Epoch 6/20  255/255 - 87s 339ms/step - loss: 4.8937 - accuracy: 0.0438 - val_loss: 4.4970 - val_accuracy: 0.1295
Epoch 7/20  255/255 - 86s 339ms/step - loss: 4.7345 - accuracy: 0.0667 - val_loss: 4.2544 - val_accuracy: 0.1878
Epoch 8/20  255/255 - 87s 340ms/step - loss: 4.5798 - accuracy: 0.0829 - val_loss: 4.0057 - val_accuracy: 0.2400
Epoch 9/20  255/255 - 86s 338ms/step - loss: 4.3906 - accuracy: 0.1106 - val_loss: 3.7407 - val_accuracy: 0.2775
Epoch 10/20 255/255 - 86s 338ms/step - loss: 4.2036 - accuracy: 0.1342 - val_loss: 3.4715 - val_accuracy: 0.3217
Epoch 11/20 255/255 - 87s 339ms/step - loss: 4.0151 - accuracy: 0.1580 - val_loss: 3.2367 - val_accuracy: 0.3671
Epoch 12/20 255/255 - 86s 338ms/step - loss: 3.7990 - accuracy: 0.1933 - val_loss: 3.0037 - val_accuracy: 0.4033
Epoch 13/20 255/255 - 86s 339ms/step - loss: 3.6676 - accuracy: 0.2101 - val_loss: 2.8150 - val_accuracy: 0.4432
Epoch 14/20 255/255 - 87s 340ms/step - loss: 3.4788 - accuracy: 0.2410 - val_loss: 2.5960 - val_accuracy: 0.4948
Epoch 15/20 255/255 - 86s 338ms/step - loss: 3.3210 - accuracy: 0.2653 - val_loss: 2.4155 - val_accuracy: 0.5285
Epoch 16/20 255/255 - 87s 340ms/step - loss: 3.1629 - accuracy: 0.2942 - val_loss: 2.2691 - val_accuracy: 0.5654
Epoch 17/20 255/255 - 86s 337ms/step - loss: 3.0306 - accuracy: 0.3116 - val_loss: 2.1205 - val_accuracy: 0.5838
Epoch 18/20 255/255 - 86s 336ms/step - loss: 2.8561 - accuracy: 0.3466 - val_loss: 1.9307 - val_accuracy: 0.6120
Epoch 19/20 255/255 - 86s 337ms/step - loss: 2.7541 - accuracy: 0.3642 - val_loss: 1.7872 - val_accuracy: 0.6568
Epoch 20/20 255/255 - 86s 339ms/step - loss: 2.5899 - accuracy: 0.4015 - val_loss: 1.6911 - val_accuracy: 0.6777
plot_training_history(resnet_history_fine_tune)
num_samples = len(df_val)
X_val = np.array(df_val['image'].tolist()).astype(np.float32)
y_val_true = np.array([np.argmax(label) for label in df_val['label_categorical']])
y_val_pred = []
# Predict in batches to limit peak memory use
for i in range(0, num_samples, batch_size):
    batch_imgs = X_val[i:i+batch_size]
    preds = resnet_tuned_model.predict(batch_imgs, verbose=0)
    batch_preds = np.argmax(preds, axis=1)
    y_val_pred.extend(batch_preds)
y_val_pred = np.array(y_val_pred)
print("X_val shape:", X_val.shape)
print("X_val dtype:", X_val.dtype)
X_val shape: (1629, 224, 224, 3) X_val dtype: float32
target_names = label_encoder.classes_ if 'label_encoder' in globals() else None
resnet_tuned_report = classification_report(
    y_val_true, y_val_pred,
    target_names=target_names,
    output_dict=True,
    zero_division=1  # Avoid divide-by-zero errors
)
df_resnet_tuned_report = pd.DataFrame(resnet_tuned_report).transpose()
acc = accuracy_score(y_val_true, y_val_pred)
df_resnet_tuned_report.loc["overall_accuracy"] = [acc, None, None, None]
df_resnet_tuned_report.to_csv("resnet_tuned_classification_report.csv")
print(f"Tuned ResNet Accuracy: {acc:.4f}")
print("Average Resnet Summary Metrics:")
print(df_resnet_tuned_report.tail(3)[["precision", "recall", "f1-score"]])
Tuned ResNet Accuracy: 0.0037
Average Resnet Summary Metrics:
precision recall f1-score
macro avg 0.974528 0.005782 0.000076
weighted avg 0.975470 0.003683 0.000050
overall_accuracy 0.003683 NaN NaN
cm = confusion_matrix(y_val_true, y_val_pred)
df_support = df_resnet_tuned_report.iloc[:-3]
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
if target_names is not None:
    top_10_indices = [np.where(target_names == cls)[0][0] for cls in top_10_classes]
else:
    top_10_indices = list(map(int, top_10_classes))  # fallback if no class names
cm_top10 = cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes,
            yticklabels=top_10_classes,
            cmap='Blues')
plt.title("Resnet Tuned - Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
# First, add a column to identify each model
df_resnet_classification_report_tail = df_resnet_classification_report.tail(4).copy()
df_resnet_classification_report_tail['Model'] = 'ResNet Untuned (10 Epochs)'
df_resnet_tuned_report_tail = df_resnet_tuned_report.tail(4).copy()
df_resnet_tuned_report_tail['Model'] = 'ResNet Tuned (20 Epochs)'
df_googlenet_classification_report_tail = df_googlenet_classification_report.tail(4).copy()
df_googlenet_classification_report_tail['Model'] = 'GoogLeNet Untuned (10 Epochs)'
df_googlenet_tuned_report_tail = df_googlenet_tuned_report.tail(4).copy()
df_googlenet_tuned_report_tail['Model'] = 'GoogLeNet Tuned (20 Epochs)'
df_combined_tail = pd.concat([
df_resnet_classification_report_tail,
df_resnet_tuned_report_tail,
df_googlenet_classification_report_tail,
df_googlenet_tuned_report_tail
])
df_combined_tail = df_combined_tail.reset_index().rename(columns={'index': 'Metric'})
df_combined_tail = df_combined_tail[['Model', 'Metric', 'precision', 'recall', 'f1-score']]
df_combined_tail.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 5 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   Model      16 non-null     object
 1   Metric     16 non-null     object
 2   precision  16 non-null     float64
 3   recall     12 non-null     float64
 4   f1-score   12 non-null     float64
dtypes: float64(3), object(2)
memory usage: 768.0+ bytes
df_combined_tail
| | Model | Metric | precision | recall | f1-score |
|---|---|---|---|---|---|
| 0 | ResNet Untuned (10 Epochs) | accuracy | 0.010436 | 0.010436 | 0.010436 |
| 1 | ResNet Untuned (10 Epochs) | macro avg | 0.989922 | 0.007070 | 0.000247 |
| 2 | ResNet Untuned (10 Epochs) | weighted avg | 0.985440 | 0.010436 | 0.000340 |
| 3 | ResNet Untuned (10 Epochs) | overall_accuracy | 0.010436 | NaN | NaN |
| 4 | ResNet Tuned (20 Epochs) | accuracy | 0.003683 | 0.003683 | 0.003683 |
| 5 | ResNet Tuned (20 Epochs) | macro avg | 0.974528 | 0.005782 | 0.000076 |
| 6 | ResNet Tuned (20 Epochs) | weighted avg | 0.975470 | 0.003683 | 0.000050 |
| 7 | ResNet Tuned (20 Epochs) | overall_accuracy | 0.003683 | NaN | NaN |
| 8 | GoogLeNet Untuned (10 Epochs) | accuracy | 0.240638 | 0.240638 | 0.240638 |
| 9 | GoogLeNet Untuned (10 Epochs) | macro avg | 0.357037 | 0.240374 | 0.219307 |
| 10 | GoogLeNet Untuned (10 Epochs) | weighted avg | 0.384768 | 0.240638 | 0.231432 |
| 11 | GoogLeNet Untuned (10 Epochs) | overall_accuracy | 0.240638 | NaN | NaN |
| 12 | GoogLeNet Tuned (20 Epochs) | accuracy | 0.003683 | 0.003683 | 0.003683 |
| 13 | GoogLeNet Tuned (20 Epochs) | macro avg | 0.638079 | 0.002648 | 0.000545 |
| 14 | GoogLeNet Tuned (20 Epochs) | weighted avg | 0.628435 | 0.003683 | 0.000747 |
| 15 | GoogLeNet Tuned (20 Epochs) | overall_accuracy | 0.003683 | NaN | NaN |
Hence, the final model selected for test evaluation is the untuned GoogLeNet trained for 10 epochs.
batch_size = 16
# Step 1: Get true labels
y_test_true = np.array([np.argmax(label) for label in df_testing['label_categorical']])
# Step 2: Predict in batches
y_test_pred = []
for i in range(0, len(df_testing), batch_size):
    batch_imgs = np.array(df_testing['image'].tolist()[i:i+batch_size])
    preds = googlenet_model.predict(batch_imgs, verbose=0)
    batch_preds = np.argmax(preds, axis=1)
    y_test_pred.extend(batch_preds)
y_test_pred = np.array(y_test_pred)
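Predicting in batches and concatenating the per-batch argmax results is equivalent to taking the argmax over all predictions at once; a quick check with stand-in probabilities (no model required):

```python
import numpy as np

rng = np.random.default_rng(0)
probs = rng.random((10, 4))  # stand-in for model.predict output, 10 samples x 4 classes

batch_size = 3
batched = []
for i in range(0, len(probs), batch_size):
    batched.extend(np.argmax(probs[i:i + batch_size], axis=1))

# The batched result matches the all-at-once argmax
assert np.array_equal(np.array(batched), np.argmax(probs, axis=1))
```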
target_names = label_encoder.classes_ if 'label_encoder' in globals() else None
final_googlenet_untuned_report = classification_report(
    y_test_true, y_test_pred,
    target_names=target_names,
    output_dict=True,
    zero_division=1  # Avoid divide-by-zero errors
)
df_final_googlenet_untuned_report = pd.DataFrame(final_googlenet_untuned_report).transpose()
acc = accuracy_score(y_test_true, y_test_pred)
df_final_googlenet_untuned_report.loc["overall_accuracy"] = [acc, None, None, None]
df_final_googlenet_untuned_report.to_csv("df_final_googlenet_untuned_classification_report.csv")
print(f"Final GoogleNet(Untuned) against test data Accuracy: {acc:.4f}")
print("Final Untuned GoogleNet metrics against test:")
print(df_final_googlenet_untuned_report.tail(3)[["precision", "recall", "f1-score"]])
Final GoogleNet(Untuned) against test data Accuracy: 0.2406
Final Untuned GoogleNet metrics against test:
precision recall f1-score
macro avg 0.340414 0.239268 0.229648
weighted avg 0.340646 0.240642 0.229778
overall_accuracy 0.240642 NaN NaN
cm = confusion_matrix(y_test_true, y_test_pred)
df_support = df_final_googlenet_untuned_report.iloc[:-3]
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
if target_names is not None:
    top_10_indices = [np.where(target_names == cls)[0][0] for cls in top_10_classes]
else:
    top_10_indices = list(map(int, top_10_classes))  # fallback if no class names
cm_top10 = cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes,
            yticklabels=top_10_classes,
            cmap='Blues')
plt.title("Untuned GoogLeNet (Test Set) - Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()